@gvrose8192 gvrose8192 commented Oct 21, 2024

Latest work on the FIPS 8 compliant kernel. Various linked tasks:
VULN-429
VULN-4095
VULN-597
SECO-169
SECO-94

Kernel selftests have passed https://github.com/user-attachments/files/17512330/kernel-selftest.log with no change in results from previous run.

I've been running netfilter tests in a loop overnight - for i in {1..50000}; do sudo valgrind --log-file=valgrind-results$i.log ./run-tests.sh; done
Typical output, unchanged over any number of runs.
nftables-test.log

Valgrind output from any of hundreds of runs is all the same -
valgrind-results.log
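
One way to check the "all the same" claim mechanically is to normalise each Valgrind log and compare them. This is a minimal sketch, not the procedure actually used here. It assumes the logs follow the `valgrind-results$i.log` naming from the loop above and that the only expected run-to-run difference is the `==PID==` prefix Valgrind puts on its output lines. Addresses or other varying fields would need extra normalisation. The helper names are hypothetical.

```python
import re
from pathlib import Path

# Valgrind prefixes its own output lines with "==<pid>==" (or "--<pid>--").
PID_PREFIX = re.compile(r"^(==|--)\d+\1")

def normalise(text):
    """Replace the per-run PID prefix so logs from different runs compare equal."""
    return [PID_PREFIX.sub(r"\1PID\1", line) for line in text.splitlines()]

def logs_identical(texts):
    """Return True when every log matches the first one after normalisation."""
    normalised = [normalise(t) for t in texts]
    return all(n == normalised[0] for n in normalised[1:])

def check_directory(directory="."):
    """Compare every valgrind-results*.log in a directory (assumed layout)."""
    paths = sorted(Path(directory).glob("valgrind-results*.log"))
    return logs_identical([p.read_text(errors="replace") for p in paths])
```

Calling `check_directory()` from the directory holding the logs would return `False` as soon as one run's normalised output differs from the first.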

With lockdep enabled and running sudo stress --cpu 28 --io 28 --vm 28 --vm-bytes 1G --timeout 3h I ran multiple passes of the nftables tests with valgrind: for i in {1..4}; do sudo valgrind --log-file=valgrind-results$i.log ./run-tests.sh; done

The nftables tests all pass with no difference from the original tests. Valgrind logs here:
valgrind-results1.log
valgrind-results2.log
valgrind-results3.log
valgrind-results4.log

No lockdep splats or other splats, no OOMs, no panics, and no other error messages during the run.

After pulling in the missing patch "netfilter: nf_tables: set backend .flush always succeeds" I reran the netfilter tests overnight with lockdep and kmemleak enabled in the kernel and running "sudo stress --cpu 28 --io 28 --vm 28 --vm-bytes 1G --timeout 8h" and found no issues. The logfiles are unchanged from the previous night's runs.
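
For long runs like this, a saved kernel log can be scanned for the usual trouble markers instead of being read by eye. The sketch below is an illustration only. It assumes the kernel log was captured to text (for example from `dmesg`), and the marker list is a hand-picked, non-exhaustive set of substrings, not an official catalogue.

```python
# Illustrative, non-exhaustive substrings; adjust for the kernel under test.
MARKERS = (
    "WARNING: possible recursive locking detected",
    "WARNING: possible circular locking dependency detected",
    "BUG:",
    "Out of memory:",
    "Kernel panic",
    "new suspected memory leaks",
)

def find_splats(log_text):
    """Return (line_number, line) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.strip()))
    return hits
```

An empty result only means none of the listed markers appeared. It is not proof that the run was clean.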

@gvrose8192 gvrose8192 requested a review from PlaidCat October 21, 2024 21:12
@gvrose8192
Author

The 2nd commit is crap - the code is right but whacked the commit message metadata. I'll fix that up. Leaving the rest for your review.

@gvrose8192 gvrose8192 force-pushed the gvrose_fips-legacy-8-compliant/4.18.0-425.13.1 branch from 22f98fe to 3f2fa28 Compare October 21, 2024 21:23
@gvrose8192
Author

The 2nd commit is crap - the code is right but whacked the commit message metadata. I'll fix that up. Leaving the rest for your review.

OK, fixed with a force push.

@gvrose8192 gvrose8192 force-pushed the gvrose_fips-legacy-8-compliant/4.18.0-425.13.1 branch 2 times, most recently from 8f50770 to 9cff1f7 Compare October 23, 2024 00:31
@gvrose8192
Author

This PR is now ready for full review.
CVEs addressed by this PR:
CVE-2023-4244
CVE-2023-52581
CVE-2024-26925

GitHub Actions build checks here:
https://github.com/ctrliq/kernel-src-tree/actions/runs/11505520259 Checked the PR for valid commit messages
https://github.com/ctrliq/kernel-src-tree/actions/runs/11505519609 Checked the compile/build for x86_64
https://github.com/ctrliq/kernel-src-tree/actions/runs/11505519601 Checked the compile/build for aarch64 - this is not really valid for fips8, but it demonstrates the code changes are portable.

Kernel selftest log shows no new errors or consistent discrepancies from the base kernel before this PR. I.e., things that failed before still fail, and things that passed before still pass.

kernel-selftest.log
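
The "failed before still fails, passed before still passes" comparison can be sketched as a small script. This is a hedged illustration, not the tooling used for this PR. It assumes both logs contain TAP-style kselftest result lines (`ok N name` / `not ok N name`), and it treats skipped tests reported as `ok ... # SKIP` as passes. The function names are hypothetical.

```python
import re

# kselftest emits TAP-style result lines such as
# "ok 1 selftests: net: foo.sh" or "not ok 2 selftests: net: bar.sh # exit=1".
RESULT = re.compile(r"^\s*(not ok|ok) \d+ (.+?)(?:\s+#.*)?$")

def parse_results(log_text):
    """Map each test name to True (passed) or False (failed)."""
    results = {}
    for line in log_text.splitlines():
        match = RESULT.match(line)
        if match:
            status, name = match.groups()
            results[name] = (status == "ok")
    return results

def compare(base_log, patched_log):
    """Return (regressions, fixes) for tests present in both logs."""
    base, patched = parse_results(base_log), parse_results(patched_log)
    common = base.keys() & patched.keys()
    regressions = sorted(t for t in common if base[t] and not patched[t])
    fixes = sorted(t for t in common if not base[t] and patched[t])
    return regressions, fixes
```

Because some selftests are flaky, a single differing run is weaker evidence than a discrepancy that repeats across several runs.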

What remains to do:

  1. This will require some close inspection of those commits marked with an 'upstream-diff' tag.
  2. Netfilter testsuite - will run it against the current fips-compliant8 branch and then against the same branch with this PR, checking for no new errors. If something passes that didn't use to, that's great and I will note it in the PR conversation.

@PlaidCat This PR is ready for review. I'll be configuring and running the netfilter tests in parallel and will record the results here.

@gvrose8192
Author

Oh - totally forgot about this included patch: 5647beb

So that adds an additional CVE fixed by this PR - CVE-2024-39502

So we have 4 total CVEs addressed by this PR, not 3.

@gvrose8192
Author

nft-test-results.log

Sample results from an nftables testsuite run on my dev system running Rocky 9.4. This was a sanity check to make sure I could actually build and install the nftables testsuite available here: https://git.netfilter.org/nftables/

Next step is to collect the results from a run with the currently available fips8-compliant kernel and compare them to the results when I run the kernel built from this PR.

Collaborator

@PlaidCat PlaidCat left a comment

EDIT: previous version invalid due to wrong reference

Collaborator

@PlaidCat PlaidCat left a comment

EDIT: previous version invalid due to wrong reference

@gvrose8192
Author

"in nf_tables_commit under case NFT_MSG_NEWSETELEM: also uses nft_setelem_remove
https://github.com/ctrliq/kernel-src-tree/blob/centos_kernel-4.18.0-534.el8/net/netfilter/nf_tables_api.c#L8341

Same as above in __nf_tables_abort NFT_MSG_NEWSETELEM
https://github.com/ctrliq/kernel-src-tree/blob/centos_kernel-4.18.0-534.el8/net/netfilter/nf_tables_api.c#L8547"

Acked - requires more investigation.

@gvrose8192 gvrose8192 force-pushed the gvrose_fips-legacy-8-compliant/4.18.0-425.13.1 branch from 35ee0b2 to c7185b5 Compare October 26, 2024 18:18
@gvrose8192
Author

Closing this pull request - will post an updated PR

@gvrose8192 gvrose8192 closed this Oct 28, 2024
@gvrose8192 gvrose8192 reopened this Oct 28, 2024
@gvrose8192 gvrose8192 force-pushed the gvrose_fips-legacy-8-compliant/4.18.0-425.13.1 branch 2 times, most recently from e82f39b to 240a26d Compare October 28, 2024 20:29
@gvrose8192
Author

All kernel selftests continue to show no new errors or consistent discrepancies between the base version and with this patch series.
I am running the nftables run-tests.sh script in a continuous loop with valgrind. No memory leaks detected so far after hundreds of loops. The logs are too big to store, but I ran a single loop manually and got the following results:

test-results.log
valgrind-results.log

I'll resume valgrind checking of the nftables nfct checks for an overnight run, to make sure no long-term (within a day) damage is found and to increase confidence in the PR.

Collaborator

@PlaidCat PlaidCat left a comment

For netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout 240a26d

it should be jira VULN-835

@gvrose8192
Author

For netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout 240a26d

it should be jira VULN-835

Good catch - I wondered about pulling that in with a different jira and meant to ask you but then got distracted by other work. I'll fix that up.

@gvrose8192
Author

"in nf_tables_commit under case NFT_MSG_NEWSETELEM: also uses nft_setelem_remove https://github.com/ctrliq/kernel-src-tree/blob/centos_kernel-4.18.0-534.el8/net/netfilter/nf_tables_api.c#L8341

Same as above in __nf_tables_abort NFT_MSG_NEWSETELEM https://github.com/ctrliq/kernel-src-tree/blob/centos_kernel-4.18.0-534.el8/net/netfilter/nf_tables_api.c#L8547"

Acked - requires more investigation.

OK, yes. Found and fixed - just missed it in an otherwise large commit. Fix incoming with next branch force push.

Collaborator

@PlaidCat PlaidCat left a comment

subsystem-sync netfilter:nf_tables 4.18.0-553
These should be:
subsystem-sync netfilter:nf_tables 4.18.0-534

Collaborator

@PlaidCat PlaidCat left a comment

netfilter: nf_tables: fix table flag updates 95b04bb

Is the upstream-diff here just contextual information in the fuzz?

github-actions bot pushed a commit that referenced this pull request Oct 23, 2025
JIRA: https://issues.redhat.com/browse/RHEL-116007

commit 387602d
Author: Robert Hodaszi <robert.hodaszi@digi.com>
Date: Thu, 3 Apr 2025 16:40:04 +0200

  Don't set WDM_READ flag in wdm_in_callback() for ZLP-s, otherwise when
  userspace tries to poll for available data, it might - incorrectly -
  believe there is something available, and when it tries to non-blocking
  read it, it might get stuck in the read loop.

  For example this is what glib does for non-blocking read (briefly):

    1. poll()
    2. if poll returns with non-zero, starts a read data loop:
      a. loop on poll() (EINTR disabled)
      b. if revents was set, reads data
        I. if read returns with EINTR or EAGAIN, goto 2.a.
        II. otherwise return with data

  So if ZLP sets WDM_READ (#1), we expect data, and try to read it (#2).
  But as that was a ZLP, and we are doing non-blocking read, wdm_read()
  returns with EAGAIN (#2.b.I), so loop again, and try to read again
  (#2.a.).

  With glib, we might get stuck in this loop forever, as EINTR is disabled
  (#2.a).

  Signed-off-by: Robert Hodaszi <robert.hodaszi@digi.com>
  Acked-by: Oliver Neukum <oneukum@suse.com>
  Link: https://lore.kernel.org/r/20250403144004.3889125-1-robert.hodaszi@digi.com
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Desnes Nunes <desnesn@redhat.com>
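
The poll/read loop described in the commit message above can be modelled with a toy simulation. This is a sketch under stated assumptions, not the cdc-wdm driver or glib code. `ToyWdm`, `glib_style_read`, and the rule that an empty read drops the readable flag are all hypothetical simplifications. A real `poll()` inside the loop would block, whereas the toy only counts iterations and reports "stuck".

```python
import errno

class ToyWdm:
    """Toy stand-in for the device state; not the real cdc-wdm driver."""

    def __init__(self, set_read_flag_on_zlp):
        self.set_read_flag_on_zlp = set_read_flag_on_zlp
        self.readable = False   # models the WDM_READ flag
        self.buffer = b""

    def in_callback(self, payload):
        """Models the IN callback: a zero-length payload is a ZLP."""
        self.buffer += payload
        if payload or self.set_read_flag_on_zlp:
            self.readable = True

    def poll(self):
        return self.readable

    def read_nonblocking(self):
        if not self.buffer:
            self.readable = False          # assumption: flag dropped on empty read
            raise BlockingIOError(errno.EAGAIN, "no data")
        data, self.buffer = self.buffer, b""
        self.readable = False
        return data

def glib_style_read(dev, max_inner_polls=5):
    # Step 1: initial poll().
    if not dev.poll():
        return "idle"
    # Step 2: read loop. A real poll() would block here; the toy just counts.
    for _ in range(max_inner_polls):
        if dev.poll():                          # 2.a / 2.b
            try:
                return dev.read_nonblocking()   # 2.b.II
            except BlockingIOError:
                continue                        # 2.b.I: back to 2.a
    return "stuck"
```

With `set_read_flag_on_zlp=True` a lone ZLP makes the reader enter the loop and never get data. With `False` the reader stays idle until a real payload arrives.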
github-actions bot pushed a commit that referenced this pull request Oct 23, 2025
JIRA: https://issues.redhat.com/browse/RHEL-116007

commit cf02334
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Thu, 5 Jun 2025 14:25:49 +0300

  If the PHY driver uses another PHY internally (e.g. in case of eUSB2,
  repeaters are represented as PHYs), then it would trigger the following
  lockdep splat because all PHYs use a single static lockdep key and thus
  lockdep can not identify whether there is a dependency or not and
  reports a false positive.

  Make PHY subsystem use dynamic lockdep keys, assigning each driver a
  separate key. This way lockdep can correctly identify dependency graph
  between mutexes.

   ============================================
   WARNING: possible recursive locking detected
   6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 Not tainted
   --------------------------------------------
   kworker/u51:0/78 is trying to acquire lock:
   ffff0008116554f0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c

   but task is already holding lock:
   ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c

   other info that might help us debug this:
    Possible unsafe locking scenario:

          CPU0
          ----
     lock(&phy->mutex);
     lock(&phy->mutex);

    *** DEADLOCK ***

    May be due to missing lock nesting notation

   4 locks held by kworker/u51:0/78:
    #0: ffff000800010948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x18c/0x5ec
    #1: ffff80008036bdb0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1b4/0x5ec
    #2: ffff0008094ac8f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x188
    #3: ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c

   stack backtrace:
   CPU: 0 UID: 0 PID: 78 Comm: kworker/u51:0 Not tainted 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 PREEMPT
   Hardware name: Qualcomm CRD, BIOS 6.0.240904.BOOT.MXF.2.4-00528.1-HAMOA-1 09/ 4/2024
   Workqueue: events_unbound deferred_probe_work_func
   Call trace:
    show_stack+0x18/0x24 (C)
    dump_stack_lvl+0x90/0xd0
    dump_stack+0x18/0x24
    print_deadlock_bug+0x258/0x348
    __lock_acquire+0x10fc/0x1f84
    lock_acquire+0x1c8/0x338
    __mutex_lock+0xb8/0x59c
    mutex_lock_nested+0x24/0x30
    phy_init+0x4c/0x12c
    snps_eusb2_hsphy_init+0x54/0x1a0
    phy_init+0xe0/0x12c
    dwc3_core_init+0x450/0x10b4
    dwc3_core_probe+0xce4/0x15fc
    dwc3_probe+0x64/0xb0
    platform_probe+0x68/0xc4
    really_probe+0xbc/0x298
    __driver_probe_device+0x78/0x12c
    driver_probe_device+0x3c/0x160
    __device_attach_driver+0xb8/0x138
    bus_for_each_drv+0x84/0xe0
    __device_attach+0x9c/0x188
    device_initial_probe+0x14/0x20
    bus_probe_device+0xac/0xb0
    deferred_probe_work_func+0x8c/0xc8
    process_one_work+0x208/0x5ec
    worker_thread+0x1c0/0x368
    kthread+0x14c/0x20c
    ret_from_fork+0x10/0x20

  Fixes: 3584f63 ("phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater")
  Fixes: e246355 ("phy: amlogic: Add Amlogic AXG PCIE PHY Driver")
  Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
  Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
  Reported-by: Johan Hovold <johan+linaro@kernel.org>
  Link: https://lore.kernel.org/lkml/ZnpoAVGJMG4Zu-Jw@hovoldconsulting.com/
  Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
  Tested-by: Johan Hovold <johan+linaro@kernel.org>
  Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
  Link: https://lore.kernel.org/r/20250605-phy-subinit-v3-1-1e1e849e10cd@oss.qualcomm.com
  Signed-off-by: Vinod Koul <vkoul@kernel.org>

Signed-off-by: Desnes Nunes <desnesn@redhat.com>
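
Why a single static key produces the false positive in the commit message above can be illustrated with a toy model of class-based lock tracking. This is a sketch, not kernel code. `ToyLockdep` and `ToyPhy` are hypothetical, and giving every toy PHY its own key is a simplification of the dynamic-key scheme the commit describes.

```python
class ToyLockdep:
    """Tracks held lock classes (keys), not individual lock instances."""

    def __init__(self):
        self.held_classes = []
        self.warnings = []

    def acquire(self, lock):
        if lock.key in self.held_classes:
            self.warnings.append("possible recursive locking detected")
        self.held_classes.append(lock.key)

    def release(self, lock):
        self.held_classes.remove(lock.key)

class ToyPhy:
    STATIC_KEY = object()      # one class shared by every PHY mutex

    def __init__(self, name, dynamic_key, inner=None):
        self.name = name
        self.inner = inner
        # Dynamic mode: a separate key per object (simplifying assumption).
        self.key = object() if dynamic_key else ToyPhy.STATIC_KEY

def toy_phy_init(lockdep, phy):
    """Take the PHY's lock, then initialise any PHY it uses internally."""
    lockdep.acquire(phy)
    if phy.inner is not None:
        toy_phy_init(lockdep, phy.inner)
    lockdep.release(phy)

def nested_init_warnings(dynamic_key):
    lockdep = ToyLockdep()
    repeater = ToyPhy("repeater", dynamic_key)
    outer = ToyPhy("eusb2", dynamic_key, inner=repeater)
    toy_phy_init(lockdep, outer)
    return lockdep.warnings
```

In the toy, `nested_init_warnings(False)` reports a recursive-locking warning even though two different objects are locked, while `nested_init_warnings(True)` reports none.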
github-actions bot pushed a commit that referenced this pull request Oct 23, 2025
JIRA: https://issues.redhat.com/browse/RHEL-115582
Upstream Status: commit 17ce3e5

commit 17ce3e5
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date:   Tue Jul 22 22:40:37 2025 +0000

    bpf: Disable migration in nf_hook_run_bpf().

    syzbot reported that the netfilter bpf prog can be called without
    migration disabled in xmit path.

    Then the assertion in __bpf_prog_run() fails, triggering the splat
    below. [0]

    Let's use bpf_prog_run_pin_on_cpu() in nf_hook_run_bpf().

    [0]:
    BUG: assuming non migratable context at ./include/linux/filter.h:703
    in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 5829, name: sshd-session
    3 locks held by sshd-session/5829:
     #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1667 [inline]
     #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x20/0x50 net/ipv4/tcp.c:1395
     #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
     #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
     #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x69/0x26c0 net/ipv4/ip_output.c:470
     #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
     #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
     #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: nf_hook+0xb2/0x680 include/linux/netfilter.h:241
    CPU: 0 UID: 0 PID: 5829 Comm: sshd-session Not tainted 6.16.0-rc6-syzkaller-00002-g155a3c003e55 #0 PREEMPT(full)
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
     __cant_migrate kernel/sched/core.c:8860 [inline]
     __cant_migrate+0x1c7/0x250 kernel/sched/core.c:8834
     __bpf_prog_run include/linux/filter.h:703 [inline]
     bpf_prog_run include/linux/filter.h:725 [inline]
     nf_hook_run_bpf+0x83/0x1e0 net/netfilter/nf_bpf_link.c:20
     nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline]
     nf_hook_slow+0xbb/0x200 net/netfilter/core.c:623
     nf_hook+0x370/0x680 include/linux/netfilter.h:272
     NF_HOOK_COND include/linux/netfilter.h:305 [inline]
     ip_output+0x1bc/0x2a0 net/ipv4/ip_output.c:433
     dst_output include/net/dst.h:459 [inline]
     ip_local_out net/ipv4/ip_output.c:129 [inline]
     __ip_queue_xmit+0x1d7d/0x26c0 net/ipv4/ip_output.c:527
     __tcp_transmit_skb+0x2686/0x3e90 net/ipv4/tcp_output.c:1479
     tcp_transmit_skb net/ipv4/tcp_output.c:1497 [inline]
     tcp_write_xmit+0x1274/0x84e0 net/ipv4/tcp_output.c:2838
     __tcp_push_pending_frames+0xaf/0x390 net/ipv4/tcp_output.c:3021
     tcp_push+0x225/0x700 net/ipv4/tcp.c:759
     tcp_sendmsg_locked+0x1870/0x42b0 net/ipv4/tcp.c:1359
     tcp_sendmsg+0x2e/0x50 net/ipv4/tcp.c:1396
     inet_sendmsg+0xb9/0x140 net/ipv4/af_inet.c:851
     sock_sendmsg_nosec net/socket.c:712 [inline]
     __sock_sendmsg net/socket.c:727 [inline]
     sock_write_iter+0x4aa/0x5b0 net/socket.c:1131
     new_sync_write fs/read_write.c:593 [inline]
     vfs_write+0x6c7/0x1150 fs/read_write.c:686
     ksys_write+0x1f8/0x250 fs/read_write.c:738
     do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
     do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7fe7d365d407
    Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
    RSP:

    Fixes: fd9c663 ("bpf: minimal support for programs hooked into netfilter framework")
    Reported-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/all/6879466d.a00a0220.3af5df.0022.GAE@google.com/
    Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Tested-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
    Acked-by: Florian Westphal <fw@strlen.de>
    Link: https://patch.msgid.link/20250722224041.112292-1-kuniyu@google.com

Signed-off-by: Florian Westphal <fwestpha@redhat.com>
github-actions bot pushed a commit that referenced this pull request Oct 23, 2025
JIRA: https://issues.redhat.com/browse/RHEL-78203

commit ee684de
Author: Viktor Malik <vmalik@redhat.com>
Date:   Tue Apr 15 17:50:14 2025 +0200

    libbpf: Fix buffer overflow in bpf_object__init_prog
    
    As shown in [1], it is possible to corrupt a BPF ELF file such that
    arbitrary BPF instructions are loaded by libbpf. This can be done by
    setting a symbol (BPF program) section offset to a large (unsigned)
    number such that <section start + symbol offset> overflows and points
    before the section data in the memory.
    
    Consider the situation below where:
    - prog_start = sec_start + symbol_offset    <-- size_t overflow here
    - prog_end   = prog_start + prog_size
    
        prog_start        sec_start        prog_end        sec_end
            |                |                 |              |
            v                v                 v              v
        .....................|################################|............
    
    The report in [1] also provides a corrupted BPF ELF which can be used as
    a reproducer:
    
        $ readelf -S crash
        Section Headers:
          [Nr] Name              Type             Address           Offset
               Size              EntSize          Flags  Link  Info  Align
        ...
          [ 2] uretprobe.mu[...] PROGBITS         0000000000000000  00000040
               0000000000000068  0000000000000000  AX       0     0     8
    
        $ readelf -s crash
        Symbol table '.symtab' contains 8 entries:
           Num:    Value          Size Type    Bind   Vis      Ndx Name
        ...
             6: ffffffffffffffb8   104 FUNC    GLOBAL DEFAULT    2 handle_tp
    
    Here, the handle_tp prog has section offset ffffffffffffffb8, i.e. will
    point before the actual memory where section 2 is allocated.
    
    This is also reported by AddressSanitizer:
    
        =================================================================
        ==1232==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7c7302fe0000 at pc 0x7fc3046e4b77 bp 0x7ffe64677cd0 sp 0x7ffe64677490
        READ of size 104 at 0x7c7302fe0000 thread T0
            #0 0x7fc3046e4b76 in memcpy (/lib64/libasan.so.8+0xe4b76)
            #1 0x00000040df3e in bpf_object__init_prog /src/libbpf/src/libbpf.c:856
            #2 0x00000040df3e in bpf_object__add_programs /src/libbpf/src/libbpf.c:928
            #3 0x00000040df3e in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3930
            #4 0x00000040df3e in bpf_object_open /src/libbpf/src/libbpf.c:8067
            #5 0x00000040f176 in bpf_object__open_file /src/libbpf/src/libbpf.c:8090
            #6 0x000000400c16 in main /poc/poc.c:8
            #7 0x7fc3043d25b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4)
            #8 0x7fc3043d2667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667)
            #9 0x000000400b34 in _start (/poc/poc+0x400b34)
    
        0x7c7302fe0000 is located 64 bytes before 104-byte region [0x7c7302fe0040,0x7c7302fe00a8)
        allocated by thread T0 here:
            #0 0x7fc3046e716b in malloc (/lib64/libasan.so.8+0xe716b)
            #1 0x7fc3045ee600 in __libelf_set_rawdata_wrlock (/lib64/libelf.so.1+0xb600)
            #2 0x7fc3045ef018 in __elf_getdata_rdlock (/lib64/libelf.so.1+0xc018)
            #3 0x00000040642f in elf_sec_data /src/libbpf/src/libbpf.c:3740
    
    The problem here is that currently, libbpf only checks that the program
    end is within the section bounds. There used to be a check
    `while (sec_off < sec_sz)` in bpf_object__add_programs, however, it was
    removed by commit 6245947 ("libbpf: Allow gaps in BPF program
    sections to support overriden weak functions").
    
    Add a check for detecting the overflow of `sec_off + prog_sz` to
    bpf_object__init_prog to fix this issue.
    
    [1] https://github.com/lmarch2/poc/blob/main/libbpf/libbpf.md
    
    Fixes: 6245947 ("libbpf: Allow gaps in BPF program sections to support overriden weak functions")
    Reported-by: lmarch2 <2524158037@qq.com>
    Signed-off-by: Viktor Malik <vmalik@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
    Link: https://github.com/lmarch2/poc/blob/main/libbpf/libbpf.md
    Link: https://lore.kernel.org/bpf/20250415155014.397603-1-vmalik@redhat.com

Signed-off-by: Viktor Malik <vmalik@redhat.com>
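
The wraparound described in the commit message above can be reproduced with a small model of 64-bit `size_t` arithmetic. This is a sketch, not libbpf's code. The names `sec_off`, `prog_sz`, and `sec_sz` come from the message, and the two functions are hypothetical stand-ins for the check before and after the fix. The values are the symbol offset, symbol size, and section size from the `readelf` output.

```python
SIZE_MAX = (1 << 64) - 1   # models a 64-bit size_t

def end_only_check(sec_off, prog_sz, sec_sz):
    """Pre-fix behaviour as described: only the (wrapped) program end is checked."""
    prog_end = (sec_off + prog_sz) & SIZE_MAX
    return prog_end <= sec_sz

def end_and_overflow_check(sec_off, prog_sz, sec_sz):
    """Additionally rejects offsets whose sum with the size wraps around."""
    prog_end = (sec_off + prog_sz) & SIZE_MAX
    wrapped = prog_end < sec_off
    return not wrapped and prog_end <= sec_sz

# Values from the reproducer: offset 0xffffffffffffffb8, size 104, section size 0x68.
accepted_before = end_only_check(0xffffffffffffffb8, 104, 0x68)
accepted_after = end_and_overflow_check(0xffffffffffffffb8, 104, 0x68)
```

Here the sum wraps to 0x20, which is below the section size, so the end-only check accepts the program while the overflow-aware check rejects it.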
github-actions bot pushed a commit that referenced this pull request Oct 24, 2025
JIRA: https://issues.redhat.com/browse/RHEL-112997

commit ffa1e7a
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Date:   Tue Mar 18 10:55:48 2025 +0100

    block: Make request_queue lockdep splats show up earlier

    In recent kernels, there are lockdep splats around the
    struct request_queue::io_lockdep_map, similar to [1], but they
    typically don't show up until reclaim with writeback happens.

    Having multiple kernel versions released with a known risk of kernel
    deadlock during reclaim writeback should IMHO be addressed and
    backported to -stable with the highest priority.

    In order to have these lockdep splats show up earlier,
    preferably during system initialization, prime the
    struct request_queue::io_lockdep_map as GFP_KERNEL reclaim-
    tainted. This will instead lead to lockdep splats looking similar
    to [2], but without the need for reclaim + writeback
    happening.

    [1]:
    [  189.762244] ======================================================
    [  189.762432] WARNING: possible circular locking dependency detected
    [  189.762441] 6.14.0-rc6-xe+ #6 Tainted: G     U
    [  189.762450] ------------------------------------------------------
    [  189.762459] kswapd0/119 is trying to acquire lock:
    [  189.762467] ffff888110ceb710 (&q->q_usage_counter(io)#26){++++}-{0:0}, at: __submit_bio+0x76/0x230
    [  189.762485]
                   but task is already holding lock:
    [  189.762494] ffffffff834c97c0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xbe/0xb00
    [  189.762507]
                   which lock already depends on the new lock.

    [  189.762519]
                   the existing dependency chain (in reverse order) is:
    [  189.762529]
                   -> #2 (fs_reclaim){+.+.}-{0:0}:
    [  189.762540]        fs_reclaim_acquire+0xc5/0x100
    [  189.762548]        kmem_cache_alloc_lru_noprof+0x4a/0x480
    [  189.762558]        alloc_inode+0xaa/0xe0
    [  189.762566]        iget_locked+0x157/0x330
    [  189.762573]        kernfs_get_inode+0x1b/0x110
    [  189.762582]        kernfs_get_tree+0x1b0/0x2e0
    [  189.762590]        sysfs_get_tree+0x1f/0x60
    [  189.762597]        vfs_get_tree+0x2a/0xf0
    [  189.762605]        path_mount+0x4cd/0xc00
    [  189.762613]        __x64_sys_mount+0x119/0x150
    [  189.762621]        x64_sys_call+0x14f2/0x2310
    [  189.762630]        do_syscall_64+0x91/0x180
    [  189.762637]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [  189.762647]
                   -> #1 (&root->kernfs_rwsem){++++}-{3:3}:
    [  189.762659]        down_write+0x3e/0xf0
    [  189.762667]        kernfs_remove+0x32/0x60
    [  189.762676]        sysfs_remove_dir+0x4f/0x60
    [  189.762685]        __kobject_del+0x33/0xa0
    [  189.762709]        kobject_del+0x13/0x30
    [  189.762716]        elv_unregister_queue+0x52/0x80
    [  189.762725]        elevator_switch+0x68/0x360
    [  189.762733]        elv_iosched_store+0x14b/0x1b0
    [  189.762756]        queue_attr_store+0x181/0x1e0
    [  189.762765]        sysfs_kf_write+0x49/0x80
    [  189.762773]        kernfs_fop_write_iter+0x17d/0x250
    [  189.762781]        vfs_write+0x281/0x540
    [  189.762790]        ksys_write+0x72/0xf0
    [  189.762798]        __x64_sys_write+0x19/0x30
    [  189.762807]        x64_sys_call+0x2a3/0x2310
    [  189.762815]        do_syscall_64+0x91/0x180
    [  189.762823]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [  189.762833]
                   -> #0 (&q->q_usage_counter(io)#26){++++}-{0:0}:
    [  189.762845]        __lock_acquire+0x1525/0x2760
    [  189.762854]        lock_acquire+0xca/0x310
    [  189.762861]        blk_mq_submit_bio+0x8a2/0xba0
    [  189.762870]        __submit_bio+0x76/0x230
    [  189.762878]        submit_bio_noacct_nocheck+0x323/0x430
    [  189.762888]        submit_bio_noacct+0x2cc/0x620
    [  189.762896]        submit_bio+0x38/0x110
    [  189.762904]        __swap_writepage+0xf5/0x380
    [  189.762912]        swap_writepage+0x3c7/0x600
    [  189.762920]        shmem_writepage+0x3da/0x4f0
    [  189.762929]        pageout+0x13f/0x310
    [  189.762937]        shrink_folio_list+0x61c/0xf60
    [  189.763261]        evict_folios+0x378/0xcd0
    [  189.763584]        try_to_shrink_lruvec+0x1b0/0x360
    [  189.763946]        shrink_one+0x10e/0x200
    [  189.764266]        shrink_node+0xc02/0x1490
    [  189.764586]        balance_pgdat+0x563/0xb00
    [  189.764934]        kswapd+0x1e8/0x430
    [  189.765249]        kthread+0x10b/0x260
    [  189.765559]        ret_from_fork+0x44/0x70
    [  189.765889]        ret_from_fork_asm+0x1a/0x30
    [  189.766198]
                   other info that might help us debug this:

    [  189.767089] Chain exists of:
                     &q->q_usage_counter(io)#26 --> &root->kernfs_rwsem --> fs_reclaim

    [  189.767971]  Possible unsafe locking scenario:

    [  189.768555]        CPU0                    CPU1
    [  189.768849]        ----                    ----
    [  189.769136]   lock(fs_reclaim);
    [  189.769421]                                lock(&root->kernfs_rwsem);
    [  189.769714]                                lock(fs_reclaim);
    [  189.770016]   rlock(&q->q_usage_counter(io)#26);
    [  189.770305]
                    *** DEADLOCK ***

    [  189.771167] 1 lock held by kswapd0/119:
    [  189.771453]  #0: ffffffff834c97c0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xbe/0xb00
    [  189.771770]
                   stack backtrace:
    [  189.772351] CPU: 4 UID: 0 PID: 119 Comm: kswapd0 Tainted: G     U             6.14.0-rc6-xe+ #6
    [  189.772353] Tainted: [U]=USER
    [  189.772354] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023
    [  189.772354] Call Trace:
    [  189.772355]  <TASK>
    [  189.772356]  dump_stack_lvl+0x6e/0xa0
    [  189.772359]  dump_stack+0x10/0x18
    [  189.772360]  print_circular_bug.cold+0x17a/0x1b7
    [  189.772363]  check_noncircular+0x13a/0x150
    [  189.772365]  ? __pfx_stack_trace_consume_entry+0x10/0x10
    [  189.772368]  __lock_acquire+0x1525/0x2760
    [  189.772368]  ? ret_from_fork_asm+0x1a/0x30
    [  189.772371]  lock_acquire+0xca/0x310
    [  189.772372]  ? __submit_bio+0x76/0x230
    [  189.772375]  ? lock_release+0xd5/0x2c0
    [  189.772376]  blk_mq_submit_bio+0x8a2/0xba0
    [  189.772378]  ? __submit_bio+0x76/0x230
    [  189.772380]  __submit_bio+0x76/0x230
    [  189.772382]  ? trace_hardirqs_on+0x1e/0xe0
    [  189.772384]  submit_bio_noacct_nocheck+0x323/0x430
    [  189.772386]  ? submit_bio_noacct_nocheck+0x323/0x430
    [  189.772387]  ? __might_sleep+0x58/0xa0
    [  189.772390]  submit_bio_noacct+0x2cc/0x620
    [  189.772391]  ? count_memcg_events+0x68/0x90
    [  189.772393]  submit_bio+0x38/0x110
    [  189.772395]  __swap_writepage+0xf5/0x380
    [  189.772396]  swap_writepage+0x3c7/0x600
    [  189.772397]  shmem_writepage+0x3da/0x4f0
    [  189.772401]  pageout+0x13f/0x310
    [  189.772406]  shrink_folio_list+0x61c/0xf60
    [  189.772409]  ? isolate_folios+0xe80/0x16b0
    [  189.772410]  ? mark_held_locks+0x46/0x90
    [  189.772412]  evict_folios+0x378/0xcd0
    [  189.772414]  ? evict_folios+0x34a/0xcd0
    [  189.772415]  ? lock_is_held_type+0xa3/0x130
    [  189.772417]  try_to_shrink_lruvec+0x1b0/0x360
    [  189.772420]  shrink_one+0x10e/0x200
    [  189.772421]  shrink_node+0xc02/0x1490
    [  189.772423]  ? shrink_node+0xa08/0x1490
    [  189.772424]  ? shrink_node+0xbd8/0x1490
    [  189.772425]  ? mem_cgroup_iter+0x366/0x480
    [  189.772427]  balance_pgdat+0x563/0xb00
    [  189.772428]  ? balance_pgdat+0x563/0xb00
    [  189.772430]  ? trace_hardirqs_on+0x1e/0xe0
    [  189.772431]  ? finish_task_switch.isra.0+0xcb/0x330
    [  189.772433]  ? __switch_to_asm+0x33/0x70
    [  189.772437]  kswapd+0x1e8/0x430
    [  189.772438]  ? __pfx_autoremove_wake_function+0x10/0x10
    [  189.772440]  ? __pfx_kswapd+0x10/0x10
    [  189.772441]  kthread+0x10b/0x260
    [  189.772443]  ? __pfx_kthread+0x10/0x10
    [  189.772444]  ret_from_fork+0x44/0x70
    [  189.772446]  ? __pfx_kthread+0x10/0x10
    [  189.772447]  ret_from_fork_asm+0x1a/0x30
    [  189.772450]  </TASK>

    [2]:
    [    8.760253] ======================================================
    [    8.760254] WARNING: possible circular locking dependency detected
    [    8.760255] 6.14.0-rc6-xe+ #7 Tainted: G     U
    [    8.760256] ------------------------------------------------------
    [    8.760257] (udev-worker)/674 is trying to acquire lock:
    [    8.760259] ffff888100e39148 (&root->kernfs_rwsem){++++}-{3:3}, at: kernfs_remove+0x32/0x60
    [    8.760265]
                   but task is already holding lock:
    [    8.760266] ffff888110dc7680 (&q->q_usage_counter(io)#27){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x12/0x30
    [    8.760272]
                   which lock already depends on the new lock.

    [    8.760272]
                   the existing dependency chain (in reverse order) is:
    [    8.760273]
                   -> #2 (&q->q_usage_counter(io)#27){++++}-{0:0}:
    [    8.760276]        blk_alloc_queue+0x30a/0x350
    [    8.760279]        blk_mq_alloc_queue+0x6b/0xe0
    [    8.760281]        scsi_alloc_sdev+0x276/0x3c0
    [    8.760284]        scsi_probe_and_add_lun+0x22a/0x440
    [    8.760286]        __scsi_scan_target+0x109/0x230
    [    8.760288]        scsi_scan_channel+0x65/0xc0
    [    8.760290]        scsi_scan_host_selected+0xff/0x140
    [    8.760292]        do_scsi_scan_host+0xa7/0xc0
    [    8.760293]        do_scan_async+0x1c/0x160
    [    8.760295]        async_run_entry_fn+0x32/0x150
    [    8.760299]        process_one_work+0x224/0x5f0
    [    8.760302]        worker_thread+0x1d4/0x3e0
    [    8.760304]        kthread+0x10b/0x260
    [    8.760306]        ret_from_fork+0x44/0x70
    [    8.760309]        ret_from_fork_asm+0x1a/0x30
    [    8.760312]
                   -> #1 (fs_reclaim){+.+.}-{0:0}:
    [    8.760315]        fs_reclaim_acquire+0xc5/0x100
    [    8.760317]        kmem_cache_alloc_lru_noprof+0x4a/0x480
    [    8.760319]        alloc_inode+0xaa/0xe0
    [    8.760322]        iget_locked+0x157/0x330
    [    8.760323]        kernfs_get_inode+0x1b/0x110
    [    8.760325]        kernfs_get_tree+0x1b0/0x2e0
    [    8.760327]        sysfs_get_tree+0x1f/0x60
    [    8.760329]        vfs_get_tree+0x2a/0xf0
    [    8.760332]        path_mount+0x4cd/0xc00
    [    8.760334]        __x64_sys_mount+0x119/0x150
    [    8.760336]        x64_sys_call+0x14f2/0x2310
    [    8.760338]        do_syscall_64+0x91/0x180
    [    8.760340]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [    8.760342]
                   -> #0 (&root->kernfs_rwsem){++++}-{3:3}:
    [    8.760345]        __lock_acquire+0x1525/0x2760
    [    8.760347]        lock_acquire+0xca/0x310
    [    8.760348]        down_write+0x3e/0xf0
    [    8.760350]        kernfs_remove+0x32/0x60
    [    8.760351]        sysfs_remove_dir+0x4f/0x60
    [    8.760353]        __kobject_del+0x33/0xa0
    [    8.760355]        kobject_del+0x13/0x30
    [    8.760356]        elv_unregister_queue+0x52/0x80
    [    8.760358]        elevator_switch+0x68/0x360
    [    8.760360]        elv_iosched_store+0x14b/0x1b0
    [    8.760362]        queue_attr_store+0x181/0x1e0
    [    8.760364]        sysfs_kf_write+0x49/0x80
    [    8.760366]        kernfs_fop_write_iter+0x17d/0x250
    [    8.760367]        vfs_write+0x281/0x540
    [    8.760370]        ksys_write+0x72/0xf0
    [    8.760372]        __x64_sys_write+0x19/0x30
    [    8.760374]        x64_sys_call+0x2a3/0x2310
    [    8.760376]        do_syscall_64+0x91/0x180
    [    8.760377]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [    8.760380]
                   other info that might help us debug this:

    [    8.760380] Chain exists of:
                     &root->kernfs_rwsem --> fs_reclaim --> &q->q_usage_counter(io)#27

    [    8.760384]  Possible unsafe locking scenario:

    [    8.760384]        CPU0                    CPU1
    [    8.760385]        ----                    ----
    [    8.760385]   lock(&q->q_usage_counter(io)#27);
    [    8.760387]                                lock(fs_reclaim);
    [    8.760388]                                lock(&q->q_usage_counter(io)#27);
    [    8.760390]   lock(&root->kernfs_rwsem);
    [    8.760391]
                    *** DEADLOCK ***

    [    8.760391] 6 locks held by (udev-worker)/674:
    [    8.760392]  #0: ffff8881209ac420 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x72/0xf0
    [    8.760398]  #1: ffff88810c80f488 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x136/0x250
    [    8.760402]  #2: ffff888125d1d330 (kn->active#101){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x13f/0x250
    [    8.760406]  #3: ffff888110dc7bb0 (&q->sysfs_lock){+.+.}-{3:3}, at: queue_attr_store+0x148/0x1e0
    [    8.760411]  #4: ffff888110dc7680 (&q->q_usage_counter(io)#27){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x12/0x30
    [    8.760416]  #5: ffff888110dc76b8 (&q->q_usage_counter(queue)#27){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x12/0x30
    [    8.760421]
                   stack backtrace:
    [    8.760422] CPU: 7 UID: 0 PID: 674 Comm: (udev-worker) Tainted: G     U             6.14.0-rc6-xe+ #7
    [    8.760424] Tainted: [U]=USER
    [    8.760425] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023
    [    8.760426] Call Trace:
    [    8.760427]  <TASK>
    [    8.760428]  dump_stack_lvl+0x6e/0xa0
    [    8.760431]  dump_stack+0x10/0x18
    [    8.760433]  print_circular_bug.cold+0x17a/0x1b7
    [    8.760437]  check_noncircular+0x13a/0x150
    [    8.760441]  ? save_trace+0x54/0x360
    [    8.760445]  __lock_acquire+0x1525/0x2760
    [    8.760446]  ? irqentry_exit+0x3a/0xb0
    [    8.760448]  ? sysvec_apic_timer_interrupt+0x57/0xc0
    [    8.760452]  lock_acquire+0xca/0x310
    [    8.760453]  ? kernfs_remove+0x32/0x60
    [    8.760457]  down_write+0x3e/0xf0
    [    8.760459]  ? kernfs_remove+0x32/0x60
    [    8.760460]  kernfs_remove+0x32/0x60
    [    8.760462]  sysfs_remove_dir+0x4f/0x60
    [    8.760464]  __kobject_del+0x33/0xa0
    [    8.760466]  kobject_del+0x13/0x30
    [    8.760467]  elv_unregister_queue+0x52/0x80
    [    8.760470]  elevator_switch+0x68/0x360
    [    8.760472]  elv_iosched_store+0x14b/0x1b0
    [    8.760475]  queue_attr_store+0x181/0x1e0
    [    8.760479]  ? lock_acquire+0xca/0x310
    [    8.760480]  ? kernfs_fop_write_iter+0x13f/0x250
    [    8.760482]  ? lock_is_held_type+0xa3/0x130
    [    8.760485]  sysfs_kf_write+0x49/0x80
    [    8.760487]  kernfs_fop_write_iter+0x17d/0x250
    [    8.760489]  vfs_write+0x281/0x540
    [    8.760494]  ksys_write+0x72/0xf0
    [    8.760497]  __x64_sys_write+0x19/0x30
    [    8.760499]  x64_sys_call+0x2a3/0x2310
    [    8.760502]  do_syscall_64+0x91/0x180
    [    8.760504]  ? trace_hardirqs_off+0x5d/0xe0
    [    8.760506]  ? handle_softirqs+0x479/0x4d0
    [    8.760508]  ? hrtimer_interrupt+0x13f/0x280
    [    8.760511]  ? irqentry_exit_to_user_mode+0x8b/0x260
    [    8.760513]  ? clear_bhb_loop+0x15/0x70
    [    8.760515]  ? clear_bhb_loop+0x15/0x70
    [    8.760516]  ? clear_bhb_loop+0x15/0x70
    [    8.760518]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [    8.760520] RIP: 0033:0x7aa3bf2f5504
    [    8.760522] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 8b 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
    [    8.760523] RSP: 002b:00007ffc1e3697d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
    [    8.760526] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007aa3bf2f5504
    [    8.760527] RDX: 0000000000000003 RSI: 00007ffc1e369ae0 RDI: 000000000000001c
    [    8.760528] RBP: 00007ffc1e369800 R08: 00007aa3bf3f51c8 R09: 00007ffc1e3698b0
    [    8.760528] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
    [    8.760529] R13: 00007ffc1e369ae0 R14: 0000613ccf21f2f0 R15: 00007aa3bf3f4e80
    [    8.760533]  </TASK>

    v2:
    - Update a code comment to increase readability (Ming Lei).

    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: linux-block@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
    Reviewed-by: Ming Lei <ming.lei@redhat.com>
    Link: https://lore.kernel.org/r/20250318095548.5187-1-thomas.hellstrom@linux.intel.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
github-actions bot pushed a commit that referenced this pull request Oct 24, 2025
JIRA: https://issues.redhat.com/browse/RHEL-112997

commit 9730763
Author: Nilay Shroff <nilay@linux.ibm.com>
Date:   Wed Mar 19 16:23:46 2025 +0530

    block: correct locking order for protecting blk-wbt parameters

    The commit '245618f8e45f ("block: protect wbt_lat_usec using
    q->elevator_lock")' introduced q->elevator_lock to protect updates
    to blk-wbt parameters when writing to the sysfs attribute
    wbt_lat_usec and the cgroup attribute io.cost.qos.  However, both
    these attributes also acquire q->rq_qos_mutex, leading to the
    following lockdep warning:

    ======================================================
    WARNING: possible circular locking dependency detected
    6.14.0-rc5+ #138 Not tainted
    ------------------------------------------------------
    bash/5902 is trying to acquire lock:
    c000000085d495a0 (&q->rq_qos_mutex){+.+.}-{4:4}, at: wbt_init+0x164/0x238

    but task is already holding lock:
    c000000085d498c8 (&q->elevator_lock){+.+.}-{4:4}, at: queue_wb_lat_store+0xb0/0x20c

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&q->elevator_lock){+.+.}-{4:4}:
            __mutex_lock+0xf0/0xa58
            ioc_qos_write+0x16c/0x85c
            cgroup_file_write+0xc4/0x32c
            kernfs_fop_write_iter+0x1b8/0x29c
            vfs_write+0x410/0x584
            ksys_write+0x84/0x140
            system_call_exception+0x134/0x360
            system_call_vectored_common+0x15c/0x2ec

    -> #0 (&q->rq_qos_mutex){+.+.}-{4:4}:
            __lock_acquire+0x1b6c/0x2ae0
            lock_acquire+0x140/0x430
            __mutex_lock+0xf0/0xa58
            wbt_init+0x164/0x238
            queue_wb_lat_store+0x1dc/0x20c
            queue_attr_store+0x12c/0x164
            sysfs_kf_write+0x6c/0xb0
            kernfs_fop_write_iter+0x1b8/0x29c
            vfs_write+0x410/0x584
            ksys_write+0x84/0x140
            system_call_exception+0x134/0x360
            system_call_vectored_common+0x15c/0x2ec

    other info that might help us debug this:

        Possible unsafe locking scenario:

            CPU0                    CPU1
            ----                    ----
        lock(&q->elevator_lock);
                                    lock(&q->rq_qos_mutex);
                                    lock(&q->elevator_lock);
        lock(&q->rq_qos_mutex);

        *** DEADLOCK ***

    6 locks held by bash/5902:
        #0: c000000051122400 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x84/0x140
        #1: c00000007383f088 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x174/0x29c
        #2: c000000008550428 (kn->active#182){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x180/0x29c
        #3: c000000085d493a8 (&q->q_usage_counter(io)#5){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x28/0x40
        #4: c000000085d493e0 (&q->q_usage_counter(queue)#5){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x28/0x40
        #5: c000000085d498c8 (&q->elevator_lock){+.+.}-{4:4}, at: queue_wb_lat_store+0xb0/0x20c

    stack backtrace:
    CPU: 17 UID: 0 PID: 5902 Comm: bash Kdump: loaded Not tainted 6.14.0-rc5+ #138
    Hardware name: IBM,9043-MRX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_028) hv:phyp pSeries
    Call Trace:
    [c0000000721ef590] [c00000000118f8a8] dump_stack_lvl+0x108/0x18c (unreliable)
    [c0000000721ef5c0] [c00000000022563c] print_circular_bug+0x448/0x604
    [c0000000721ef670] [c000000000225a44] check_noncircular+0x24c/0x26c
    [c0000000721ef740] [c00000000022bf28] __lock_acquire+0x1b6c/0x2ae0
    [c0000000721ef870] [c000000000229240] lock_acquire+0x140/0x430
    [c0000000721ef970] [c0000000011cfbec] __mutex_lock+0xf0/0xa58
    [c0000000721efaa0] [c00000000096c46c] wbt_init+0x164/0x238
    [c0000000721efaf0] [c0000000008f8cd8] queue_wb_lat_store+0x1dc/0x20c
    [c0000000721efb50] [c0000000008f8fa0] queue_attr_store+0x12c/0x164
    [c0000000721efc60] [c0000000007c11cc] sysfs_kf_write+0x6c/0xb0
    [c0000000721efca0] [c0000000007bfa4c] kernfs_fop_write_iter+0x1b8/0x29c
    [c0000000721efcf0] [c0000000006a281c] vfs_write+0x410/0x584
    [c0000000721efdc0] [c0000000006a2cc8] ksys_write+0x84/0x140
    [c0000000721efe10] [c000000000031b64] system_call_exception+0x134/0x360
    [c0000000721efe50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec

    From the above log it's apparent that the method which writes to the
    sysfs attribute wbt_lat_usec acquires q->elevator_lock first, and then
    acquires q->rq_qos_mutex. However, the other method, which writes to
    io.cost.qos, acquires q->rq_qos_mutex first, and then acquires
    q->elevator_lock. So this could potentially cause a deadlock.

    A closer look at ioc_qos_write shows that correcting the lock order is
    non-trivial because q->rq_qos_mutex is acquired in blkg_conf_open_bdev
    and released in blkg_conf_exit. The function blkg_conf_open_bdev is
    responsible for parsing user input and finding the corresponding block
    device (bdev) from the user provided major:minor number.

    Since we do not know the bdev until blkg_conf_open_bdev completes, we
    cannot simply move the q->elevator_lock acquisition before
    blkg_conf_open_bdev. To address this, we introduce new helpers
    blkg_conf_open_bdev_frozen and blkg_conf_exit_frozen, which are just
    wrappers around blkg_conf_open_bdev and blkg_conf_exit respectively.
    The helper blkg_conf_open_bdev_frozen is similar to
    blkg_conf_open_bdev, but additionally freezes the queue, acquires
    q->elevator_lock and ensures the correct locking order is followed
    between q->elevator_lock and q->rq_qos_mutex. Similarly, the helper
    blkg_conf_exit_frozen, in addition to unfreezing the queue, ensures
    that we release the locks in the correct order.

    By using these helpers, now we maintain the same locking order in all
    code paths where we update blk-wbt parameters.
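
The ordering rule described above can be sketched in user space. This is a minimal illustration with pthread mutexes, not the kernel implementation: `struct toy_queue` and every `toy_*` function are hypothetical names, and the way the wrapper drops the inner lock and retakes it after the outer lock is an assumption about how such a wrapper can restore the order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for a request queue; not kernel code. */
struct toy_queue {
	pthread_mutex_t elevator_lock;  /* outer lock */
	pthread_mutex_t rq_qos_mutex;   /* inner lock */
	atomic_int freeze_depth;        /* models freezing/unfreezing the queue */
	int wbt_lat_usec;               /* written only with both locks held */
};

static void toy_queue_init(struct toy_queue *q)
{
	pthread_mutex_init(&q->elevator_lock, NULL);
	pthread_mutex_init(&q->rq_qos_mutex, NULL);
	atomic_init(&q->freeze_depth, 0);
	q->wbt_lat_usec = 0;
}

/* Models blkg_conf_open_bdev(): returns with rq_qos_mutex held. */
static void toy_conf_open(struct toy_queue *q)
{
	pthread_mutex_lock(&q->rq_qos_mutex);
}

/* Models blkg_conf_exit(): drops rq_qos_mutex. */
static void toy_conf_exit(struct toy_queue *q)
{
	pthread_mutex_unlock(&q->rq_qos_mutex);
}

/* Wrapper: freeze, then take elevator_lock before rq_qos_mutex. */
static void toy_conf_open_frozen(struct toy_queue *q)
{
	toy_conf_open(q);
	/* Assumption: nothing is protected yet, so the inner lock can be
	 * dropped here and retaken after the outer lock. */
	pthread_mutex_unlock(&q->rq_qos_mutex);

	atomic_fetch_add(&q->freeze_depth, 1);
	pthread_mutex_lock(&q->elevator_lock);
	pthread_mutex_lock(&q->rq_qos_mutex);
}

/* Wrapper: release in reverse order, then unfreeze. */
static void toy_conf_exit_frozen(struct toy_queue *q)
{
	toy_conf_exit(q);
	pthread_mutex_unlock(&q->elevator_lock);
	atomic_fetch_sub(&q->freeze_depth, 1);
}

/* Path 1 (models the sysfs write): outer lock, then inner lock. */
static void toy_write_wbt_lat(struct toy_queue *q, int usec)
{
	pthread_mutex_lock(&q->elevator_lock);
	pthread_mutex_lock(&q->rq_qos_mutex);
	q->wbt_lat_usec = usec;
	pthread_mutex_unlock(&q->rq_qos_mutex);
	pthread_mutex_unlock(&q->elevator_lock);
}

/* Path 2 (models the io.cost.qos write): same order via the wrappers. */
static void toy_write_cost_qos(struct toy_queue *q, int usec)
{
	toy_conf_open_frozen(q);
	q->wbt_lat_usec = usec;
	toy_conf_exit_frozen(q);
}
```

Because both paths end up holding `elevator_lock` before `rq_qos_mutex`, two threads running them concurrently cannot each hold the lock the other is waiting for.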

    Fixes: 245618f ("block: protect wbt_lat_usec using q->elevator_lock")
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Closes: https://lore.kernel.org/oe-lkp/202503171650.cc082b66-lkp@intel.com
    Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
    Link: https://lore.kernel.org/r/20250319105518.468941-3-nilay@linux.ibm.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
github-actions bot pushed a commit that referenced this pull request Oct 24, 2025
JIRA: https://issues.redhat.com/browse/RHEL-112997

commit 9dc7a88
Author: Ming Lei <ming.lei@redhat.com>
Date:   Mon May 5 22:18:00 2025 +0800

    block: move hctx debugfs/sysfs registering out of freezing queue

    Move hctx debugfs/sysfs register out of freezing queue in
    __blk_mq_update_nr_hw_queues(), so that the following lockdep dependency
    can be killed:

            #2 (&q->q_usage_counter(io)#16){++++}-{0:0}:
            #1 (fs_reclaim){+.+.}-{0:0}:
            #0 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}: //debugfs

    And registering/un-registering hctx debugfs/sysfs does not require queue to
    be frozen:

    - hctx sysfs attributes show() are drained when removing kobject, and
      there isn't store() implementation for hctx sysfs attributes

    - debugfs entry read() is drained too when removing debugfs directory,
      and there isn't write() implementation for hctx debugfs too

    - so it is safe to register/unregister hctx sysfs/debugfs without
      freezing queue because the code paths change nothing, and we just
      need to keep hctx live

    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Link: https://lore.kernel.org/r/20250505141805.2751237-23-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
github-actions bot pushed a commit that referenced this pull request Oct 24, 2025
JIRA: https://issues.redhat.com/browse/RHEL-112997

commit 78c2713
Author: Ming Lei <ming.lei@redhat.com>
Date:   Mon May 5 22:18:03 2025 +0800

    block: move wbt_enable_default() out of queue freezing from sched ->exit()

    The scheduler's ->exit() is called with the queue frozen and the
    elevator lock held, and wbt_enable_default() can't be called with the
    queue frozen; otherwise the following lockdep warning is triggered:

            #6 (&q->rq_qos_mutex){+.+.}-{4:4}:
            #5 (&eq->sysfs_lock){+.+.}-{4:4}:
            #4 (&q->elevator_lock){+.+.}-{4:4}:
            #3 (&q->q_usage_counter(io)#3){++++}-{0:0}:
            #2 (fs_reclaim){+.+.}-{0:0}:
            #1 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}:
            #0 (&q->debugfs_mutex){+.+.}-{4:4}:

    Fix the issue by moving wbt_enable_default() out of bfq's exit(), and
    call it from elevator_change_done().

    Meanwhile, add disk->rqos_state_mutex to cover wbt state changes, which
    matches the purpose better than ->elevator_lock.

    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20250505141805.2751237-26-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
github-actions bot pushed a commit that referenced this pull request Oct 24, 2025
JIRA: https://issues.redhat.com/browse/RHEL-112997

commit 8b428f4
Author: Ming Lei <ming.lei@redhat.com>
Date:   Wed Jul 9 19:17:44 2025 +0800

    nbd: fix lockdep deadlock warning

    nbd grabs the device lock nbd->config_lock while updating nr_hw_queues,
    which causes the following lock dependency:

    -> #2 (&disk->open_mutex){+.+.}-{4:4}:
           lock_acquire kernel/locking/lockdep.c:5871 [inline]
           lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
           __mutex_lock_common kernel/locking/mutex.c:602 [inline]
           __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
           mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
           __del_gendisk+0x132/0xac6 block/genhd.c:706
           del_gendisk+0xf6/0x19a block/genhd.c:819
           nbd_dev_remove+0x3c/0xf2 drivers/block/nbd.c:268
           nbd_dev_remove_work+0x1c/0x26 drivers/block/nbd.c:284
           process_one_work+0x96a/0x1f32 kernel/workqueue.c:3238
           process_scheduled_works kernel/workqueue.c:3321 [inline]
           worker_thread+0x5ce/0xde8 kernel/workqueue.c:3402
           kthread+0x39c/0x7d4 kernel/kthread.c:464
           ret_from_fork_kernel+0x2a/0xbb2 arch/riscv/kernel/process.c:214
           ret_from_fork_kernel_asm+0x16/0x18 arch/riscv/kernel/entry.S:327

    -> #1 (&set->update_nr_hwq_lock){++++}-{4:4}:
           lock_acquire kernel/locking/lockdep.c:5871 [inline]
           lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
           down_write+0x9c/0x19a kernel/locking/rwsem.c:1577
           blk_mq_update_nr_hw_queues+0x3e/0xb86 block/blk-mq.c:5041
           nbd_start_device+0x140/0xb2c drivers/block/nbd.c:1476
           nbd_genl_connect+0xae0/0x1b24 drivers/block/nbd.c:2201
           genl_family_rcv_msg_doit+0x206/0x2e6 net/netlink/genetlink.c:1115
           genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
           genl_rcv_msg+0x514/0x78e net/netlink/genetlink.c:1210
           netlink_rcv_skb+0x206/0x3be net/netlink/af_netlink.c:2534
           genl_rcv+0x36/0x4c net/netlink/genetlink.c:1219
           netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
           netlink_unicast+0x4f0/0x82c net/netlink/af_netlink.c:1339
           netlink_sendmsg+0x85e/0xdd6 net/netlink/af_netlink.c:1883
           sock_sendmsg_nosec net/socket.c:712 [inline]
           __sock_sendmsg+0xcc/0x160 net/socket.c:727
           ____sys_sendmsg+0x63e/0x79c net/socket.c:2566
           ___sys_sendmsg+0x144/0x1e6 net/socket.c:2620
           __sys_sendmsg+0x188/0x246 net/socket.c:2652
           __do_sys_sendmsg net/socket.c:2657 [inline]
           __se_sys_sendmsg net/socket.c:2655 [inline]
           __riscv_sys_sendmsg+0x70/0xa2 net/socket.c:2655
           syscall_handler+0x94/0x118 arch/riscv/include/asm/syscall.h:112
           do_trap_ecall_u+0x396/0x530 arch/riscv/kernel/traps.c:341
           handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197

    -> #0 (&nbd->config_lock){+.+.}-{4:4}:
           check_noncircular+0x132/0x146 kernel/locking/lockdep.c:2178
           check_prev_add kernel/locking/lockdep.c:3168 [inline]
           check_prevs_add kernel/locking/lockdep.c:3287 [inline]
           validate_chain kernel/locking/lockdep.c:3911 [inline]
           __lock_acquire+0x12b2/0x24ea kernel/locking/lockdep.c:5240
           lock_acquire kernel/locking/lockdep.c:5871 [inline]
           lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
           __mutex_lock_common kernel/locking/mutex.c:602 [inline]
           __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
           mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
           refcount_dec_and_mutex_lock+0x60/0xd8 lib/refcount.c:118
           nbd_config_put+0x3a/0x610 drivers/block/nbd.c:1423
           nbd_release+0x94/0x15c drivers/block/nbd.c:1735
           blkdev_put_whole+0xac/0xee block/bdev.c:721
           bdev_release+0x3fe/0x600 block/bdev.c:1144
           blkdev_release+0x1a/0x26 block/fops.c:684
           __fput+0x382/0xa8c fs/file_table.c:465
           ____fput+0x1c/0x26 fs/file_table.c:493
           task_work_run+0x16a/0x25e kernel/task_work.c:227
           resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
           exit_to_user_mode_loop+0x118/0x134 kernel/entry/common.c:114
           exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
           syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
           syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
           do_trap_ecall_u+0x3f0/0x530 arch/riscv/kernel/traps.c:355
           handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197

    Also it isn't necessary to hold nbd->config_lock, because
    blk_mq_update_nr_hw_queues() does grab the tagset lock to synchronize
    everything.

    Fix the issue by releasing ->config_lock and retrying in case of a
    concurrent update of nr_hw_queues.
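
The release-and-retry approach can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the nbd driver code: `struct toy_dev` and all `toy_*` names are hypothetical, and `toy_update_racing` only simulates a concurrent writer so that the retry path is exercised.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the device; not the nbd driver's structures. */
struct toy_dev {
	pthread_mutex_t config_lock;  /* protects num_connections */
	pthread_mutex_t tagset_lock;  /* protects nr_hw_queues */
	int num_connections;
	int nr_hw_queues;
};

typedef void (*toy_update_fn)(struct toy_dev *dev, int nr);

static void toy_dev_init(struct toy_dev *dev, int connections)
{
	pthread_mutex_init(&dev->config_lock, NULL);
	pthread_mutex_init(&dev->tagset_lock, NULL);
	dev->num_connections = connections;
	dev->nr_hw_queues = 1;
}

/* The update serializes itself and does not need config_lock. */
static void toy_update_nr_hw_queues(struct toy_dev *dev, int nr)
{
	pthread_mutex_lock(&dev->tagset_lock);
	dev->nr_hw_queues = nr;
	pthread_mutex_unlock(&dev->tagset_lock);
}

/* Simulates one concurrent change of num_connections while config_lock
 * is dropped, so that the caller has to retry once. */
static void toy_update_racing(struct toy_dev *dev, int nr)
{
	static int raced;

	toy_update_nr_hw_queues(dev, nr);
	if (!raced) {
		raced = 1;
		pthread_mutex_lock(&dev->config_lock);
		dev->num_connections = nr + 1;
		pthread_mutex_unlock(&dev->config_lock);
	}
}

/* Called with config_lock held and returns with it held.
 * Returns the number of update attempts that were needed. */
static int toy_start_device(struct toy_dev *dev, toy_update_fn update)
{
	int attempts = 0;
	int wanted;

	do {
		wanted = dev->num_connections;  /* snapshot under config_lock */
		pthread_mutex_unlock(&dev->config_lock);
		update(dev, wanted);            /* runs without config_lock */
		pthread_mutex_lock(&dev->config_lock);
		attempts++;
	} while (wanted != dev->num_connections);  /* changed meanwhile: retry */

	return attempts;
}
```

Since `config_lock` is never held across the update, the update's own locking no longer nests inside it, which is the dependency the lockdep report complains about.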

    Fixes: 98e68f6 ("block: prevent adding/deleting disk during updating nr_hw_queues")
    Reported-by: syzbot+2bcecf3c38cb3e8fdc8d@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/all/6855034f.a00a0220.137b3.0031.GAE@google.com
    Reviewed-by: Yu Kuai <yukuai3@huawei.com>
    Cc: Nilay Shroff <nilay@linux.ibm.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
    Link: https://lore.kernel.org/r/20250709111744.2353050-1-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
github-actions bot pushed a commit that referenced this pull request Oct 24, 2025
JIRA: https://issues.redhat.com/browse/RHEL-115630
Upstream Status: commit 17ce3e5

commit 17ce3e5
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date:   Tue Jul 22 22:40:37 2025 +0000

    bpf: Disable migration in nf_hook_run_bpf().

    syzbot reported that the netfilter bpf prog can be called without
    migration disabled in xmit path.

    Then the assertion in __bpf_prog_run() fails, triggering the splat
    below. [0]

    Let's use bpf_prog_run_pin_on_cpu() in nf_hook_run_bpf().

    [0]:
    BUG: assuming non migratable context at ./include/linux/filter.h:703
    in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 5829, name: sshd-session
    3 locks held by sshd-session/5829:
     #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1667 [inline]
     #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x20/0x50 net/ipv4/tcp.c:1395
     #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
     #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
     #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x69/0x26c0 net/ipv4/ip_output.c:470
     #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
     #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
     #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: nf_hook+0xb2/0x680 include/linux/netfilter.h:241
    CPU: 0 UID: 0 PID: 5829 Comm: sshd-session Not tainted 6.16.0-rc6-syzkaller-00002-g155a3c003e55 #0 PREEMPT(full)
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
     __cant_migrate kernel/sched/core.c:8860 [inline]
     __cant_migrate+0x1c7/0x250 kernel/sched/core.c:8834
     __bpf_prog_run include/linux/filter.h:703 [inline]
     bpf_prog_run include/linux/filter.h:725 [inline]
     nf_hook_run_bpf+0x83/0x1e0 net/netfilter/nf_bpf_link.c:20
     nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline]
     nf_hook_slow+0xbb/0x200 net/netfilter/core.c:623
     nf_hook+0x370/0x680 include/linux/netfilter.h:272
     NF_HOOK_COND include/linux/netfilter.h:305 [inline]
     ip_output+0x1bc/0x2a0 net/ipv4/ip_output.c:433
     dst_output include/net/dst.h:459 [inline]
     ip_local_out net/ipv4/ip_output.c:129 [inline]
     __ip_queue_xmit+0x1d7d/0x26c0 net/ipv4/ip_output.c:527
     __tcp_transmit_skb+0x2686/0x3e90 net/ipv4/tcp_output.c:1479
     tcp_transmit_skb net/ipv4/tcp_output.c:1497 [inline]
     tcp_write_xmit+0x1274/0x84e0 net/ipv4/tcp_output.c:2838
     __tcp_push_pending_frames+0xaf/0x390 net/ipv4/tcp_output.c:3021
     tcp_push+0x225/0x700 net/ipv4/tcp.c:759
     tcp_sendmsg_locked+0x1870/0x42b0 net/ipv4/tcp.c:1359
     tcp_sendmsg+0x2e/0x50 net/ipv4/tcp.c:1396
     inet_sendmsg+0xb9/0x140 net/ipv4/af_inet.c:851
     sock_sendmsg_nosec net/socket.c:712 [inline]
     __sock_sendmsg net/socket.c:727 [inline]
     sock_write_iter+0x4aa/0x5b0 net/socket.c:1131
     new_sync_write fs/read_write.c:593 [inline]
     vfs_write+0x6c7/0x1150 fs/read_write.c:686
     ksys_write+0x1f8/0x250 fs/read_write.c:738
     do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
     do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7fe7d365d407
    Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
    RSP:

    Fixes: fd9c663 ("bpf: minimal support for programs hooked into netfilter framework")
    Reported-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/all/6879466d.a00a0220.3af5df.0022.GAE@google.com/
    Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Tested-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com
    Acked-by: Florian Westphal <fw@strlen.de>
    Link: https://patch.msgid.link/20250722224041.112292-1-kuniyu@google.com

Signed-off-by: Florian Westphal <fwestpha@redhat.com>
roxanan1996 added a commit that referenced this pull request Oct 24, 2025
jira VULN-136700
cve CVE-2025-38392
commit-author Ahmed Zaki <ahmed.zaki@intel.com>
commit b2beb5b

With VIRTCHNL2_CAP_MACFILTER enabled, the following warning is generated
on module load:

[  324.701677] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578
[  324.701684] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1582, name: NetworkManager
[  324.701689] preempt_count: 201, expected: 0
[  324.701693] RCU nest depth: 0, expected: 0
[  324.701697] 2 locks held by NetworkManager/1582:
[  324.701702]  #0: ffffffff9f7be770 (rtnl_mutex){....}-{3:3}, at: rtnl_newlink+0x791/0x21e0
[  324.701730]  #1: ff1100216c380368 (_xmit_ETHER){....}-{2:2}, at: __dev_open+0x3f0/0x870
[  324.701749] Preemption disabled at:
[  324.701752] [<ffffffff9cd23b9d>] __dev_open+0x3dd/0x870
[  324.701765] CPU: 30 UID: 0 PID: 1582 Comm: NetworkManager Not tainted 6.15.0-rc5+ #2 PREEMPT(voluntary)
[  324.701771] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022
[  324.701774] Call Trace:
[  324.701777]  <TASK>
[  324.701779]  dump_stack_lvl+0x5d/0x80
[  324.701788]  ? __dev_open+0x3dd/0x870
[  324.701793]  __might_resched.cold+0x1ef/0x23d
<..>
[  324.701818]  __mutex_lock+0x113/0x1b80
<..>
[  324.701917]  idpf_ctlq_clean_sq+0xad/0x4b0 [idpf]
[  324.701935]  ? kasan_save_track+0x14/0x30
[  324.701941]  idpf_mb_clean+0x143/0x380 [idpf]
<..>
[  324.701991]  idpf_send_mb_msg+0x111/0x720 [idpf]
[  324.702009]  idpf_vc_xn_exec+0x4cc/0x990 [idpf]
[  324.702021]  ? rcu_is_watching+0x12/0xc0
[  324.702035]  idpf_add_del_mac_filters+0x3ed/0xb50 [idpf]
<..>
[  324.702122]  __hw_addr_sync_dev+0x1cf/0x300
[  324.702126]  ? find_held_lock+0x32/0x90
[  324.702134]  idpf_set_rx_mode+0x317/0x390 [idpf]
[  324.702152]  __dev_open+0x3f8/0x870
[  324.702159]  ? __pfx___dev_open+0x10/0x10
[  324.702174]  __dev_change_flags+0x443/0x650
<..>
[  324.702208]  netif_change_flags+0x80/0x160
[  324.702218]  do_setlink.isra.0+0x16a0/0x3960
<..>
[  324.702349]  rtnl_newlink+0x12fd/0x21e0

The sequence is as follows:
	rtnl_newlink()->
	__dev_change_flags()->
	__dev_open()->
	dev_set_rx_mode() ->   # disables BH and grabs "dev->addr_list_lock"
	idpf_set_rx_mode() ->  # proceed only if VIRTCHNL2_CAP_MACFILTER is ON
	__dev_uc_sync() ->
	idpf_add_mac_filter ->
	idpf_add_del_mac_filters ->
	idpf_send_mb_msg() ->
	idpf_mb_clean() ->
	idpf_ctlq_clean_sq()   # mutex_lock(cq_lock)

Fix by converting cq_lock to a spinlock. All operations under the new
lock are safe except freeing the DMA memory, which may use vunmap(). Fix
that by requesting physically contiguous memory for the DMA mapping.
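
As a rough user-space illustration of why a spinning lock is acceptable where sleeping is not, the sketch below protects a queue-cleaning step with a C11 `atomic_flag` lock that busy-waits and never blocks. It is not the idpf code: `struct toy_ctlq` and the `toy_*` functions are hypothetical, and the kernel's spinlock API is only modelled here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a control queue; not the idpf structures. */
struct toy_ctlq {
	atomic_flag cq_lock;  /* spinning lock: waiters busy-wait, never sleep */
	int pending;          /* descriptors waiting to be cleaned */
	int cleaned;          /* descriptors cleaned so far */
};

static void toy_ctlq_init(struct toy_ctlq *cq, int pending)
{
	atomic_flag_clear(&cq->cq_lock);
	cq->pending = pending;
	cq->cleaned = 0;
}

static void toy_spin_lock(atomic_flag *lock)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void toy_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Cleans up to 'budget' descriptors; the critical section is short and
 * contains nothing that could block. Returns the number cleaned. */
static int toy_ctlq_clean_sq(struct toy_ctlq *cq, int budget)
{
	int n;

	toy_spin_lock(&cq->cq_lock);
	n = cq->pending < budget ? cq->pending : budget;
	cq->pending -= n;
	cq->cleaned += n;
	toy_spin_unlock(&cq->cq_lock);

	return n;
}
```

The same constraint applies to everything inside the critical section, which is why the commit also changes how the DMA memory is allocated so that freeing it under the lock does not need to sleep.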

Fixes: a251eee ("idpf: add SRIOV support and other ndo_ops")
	Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
	Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
	Reviewed-by: Simon Horman <horms@kernel.org>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit b2beb5b)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
github-actions bot pushed a commit that referenced this pull request Oct 26, 2025
The original code causes a circular locking dependency found by lockdep.

======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 Tainted: G S   U
------------------------------------------------------
xe_fault_inject/5091 is trying to acquire lock:
ffff888156815688 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}, at: __flush_work+0x25d/0x660

but task is already holding lock:

ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&devcd->mutex){+.+.}-{3:3}:
       mutex_lock_nested+0x4e/0xc0
       devcd_data_write+0x27/0x90
       sysfs_kf_bin_write+0x80/0xf0
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (kn->active#236){++++}-{0:0}:
       kernfs_drain+0x1e2/0x200
       __kernfs_remove+0xae/0x400
       kernfs_remove_by_name_ns+0x5d/0xc0
       remove_files+0x54/0x70
       sysfs_remove_group+0x3d/0xa0
       sysfs_remove_groups+0x2e/0x60
       device_remove_attrs+0xc7/0x100
       device_del+0x15d/0x3b0
       devcd_del+0x19/0x30
       process_one_work+0x22b/0x6f0
       worker_thread+0x1e8/0x3d0
       kthread+0x11c/0x250
       ret_from_fork+0x26c/0x2e0
       ret_from_fork_asm+0x1a/0x30
-> #0 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}:
       __lock_acquire+0x1661/0x2860
       lock_acquire+0xc4/0x2f0
       __flush_work+0x27a/0x660
       flush_delayed_work+0x5d/0xa0
       dev_coredump_put+0x63/0xa0
       xe_driver_devcoredump_fini+0x12/0x20 [xe]
       devm_action_release+0x12/0x30
       release_nodes+0x3a/0x120
       devres_release_all+0x8a/0xd0
       device_unbind_cleanup+0x12/0x80
       device_release_driver_internal+0x23a/0x280
       device_driver_detach+0x14/0x20
       unbind_store+0xaf/0xc0
       drv_attr_store+0x21/0x50
       sysfs_kf_write+0x4a/0x80
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
other info that might help us debug this:
Chain exists of: (work_completion)(&(&devcd->del_wk)->work) --> kn->active#236 --> &devcd->mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&devcd->mutex);
                               lock(kn->active#236);
                               lock(&devcd->mutex);
  lock((work_completion)(&(&devcd->del_wk)->work));
 *** DEADLOCK ***
5 locks held by xe_fault_inject/5091:
 #0: ffff8881129f9488 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x72/0xf0
 #1: ffff88810c755078 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x123/0x220
 #2: ffff8881054811a0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x55/0x280
 #3: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
 #4: ffffffff8359e020 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x660
stack backtrace:
CPU: 14 UID: 0 PID: 5091 Comm: xe_fault_inject Tainted: G S   U              6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 PREEMPT_{RT,(lazy)}
Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.10 12/13/2021
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xf0
 dump_stack+0x10/0x20
 print_circular_bug+0x285/0x360
 check_noncircular+0x135/0x150
 ? register_lock_class+0x48/0x4a0
 __lock_acquire+0x1661/0x2860
 lock_acquire+0xc4/0x2f0
 ? __flush_work+0x25d/0x660
 ? mark_held_locks+0x46/0x90
 ? __flush_work+0x25d/0x660
 __flush_work+0x27a/0x660
 ? __flush_work+0x25d/0x660
 ? trace_hardirqs_on+0x1e/0xd0
 ? __pfx_wq_barrier_func+0x10/0x10
 flush_delayed_work+0x5d/0xa0
 dev_coredump_put+0x63/0xa0
 xe_driver_devcoredump_fini+0x12/0x20 [xe]
 devm_action_release+0x12/0x30
 release_nodes+0x3a/0x120
 devres_release_all+0x8a/0xd0
 device_unbind_cleanup+0x12/0x80
 device_release_driver_internal+0x23a/0x280
 ? bus_find_device+0xa8/0xe0
 device_driver_detach+0x14/0x20
 unbind_store+0xaf/0xc0
 drv_attr_store+0x21/0x50
 sysfs_kf_write+0x4a/0x80
 kernfs_fop_write_iter+0x169/0x220
 vfs_write+0x293/0x560
 ksys_write+0x72/0xf0
 __x64_sys_write+0x19/0x30
 x64_sys_call+0x2bf/0x2660
 do_syscall_64+0x93/0xb60
 ? __f_unlock_pos+0x15/0x20
 ? __x64_sys_getdents64+0x9b/0x130
 ? __pfx_filldir64+0x10/0x10
 ? do_syscall_64+0x1a2/0xb60
 ? clear_bhb_loop+0x30/0x80
 ? clear_bhb_loop+0x30/0x80
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x76e292edd574
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
RSP: 002b:00007fffe247a828 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076e292edd574
RDX: 000000000000000c RSI: 00006267f6306063 RDI: 000000000000000b
RBP: 000000000000000c R08: 000076e292fc4b20 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 00006267f6306063
R13: 000000000000000b R14: 00006267e6859c00 R15: 000076e29322a000
 </TASK>
xe 0000:03:00.0: [drm] Xe device coredump has been deleted.

Fixes: 01daccf ("devcoredump : Serialize devcd_del work")
Cc: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250723142416.1020423-1-dev@lankhorst.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
github-actions bot pushed a commit that referenced this pull request Oct 29, 2025
JIRA: https://issues.redhat.com/browse/RHEL-115591
Upstream Status: linux.git
Conflicts:
  * (context) Missing upstream commit 5cde39e ("vxlan: Rename FDB
    Tx lookup function"):
    The vxlan_find_mac_tx() function is still called "vxlan_find_mac"
    in CentOS Stream 10.

commit 1f5d2fd
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Mon Sep 1 09:50:34 2025 +0300

    vxlan: Fix NPD in {arp,neigh}_reduce() when using nexthop objects

    When the "proxy" option is enabled on a VXLAN device, the device will
    suppress ARP requests and IPv6 Neighbor Solicitation messages if it is
    able to reply on behalf of the remote host. That is, if a matching and
    valid neighbor entry is configured on the VXLAN device whose MAC address
    is not behind the "any" remote (0.0.0.0 / ::).

    The code currently assumes that the FDB entry for the neighbor's MAC
    address points to a valid remote destination, but this is incorrect if
    the entry is associated with an FDB nexthop group. This can result in a
    NPD [1][3] which can be reproduced using [2][4].

    Fix by checking that the remote destination exists before dereferencing
    it.

    [1]
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    [...]
    CPU: 4 UID: 0 PID: 365 Comm: arping Not tainted 6.17.0-rc2-virtme-g2a89cb21162c #2 PREEMPT(voluntary)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-4.fc41 04/01/2014
    RIP: 0010:vxlan_xmit+0xb58/0x15f0
    [...]
    Call Trace:
     <TASK>
     dev_hard_start_xmit+0x5d/0x1c0
     __dev_queue_xmit+0x246/0xfd0
     packet_sendmsg+0x113a/0x1850
     __sock_sendmsg+0x38/0x70
     __sys_sendto+0x126/0x180
     __x64_sys_sendto+0x24/0x30
     do_syscall_64+0xa4/0x260
     entry_SYSCALL_64_after_hwframe+0x4b/0x53

    [2]
     #!/bin/bash

     ip address add 192.0.2.1/32 dev lo

     ip nexthop add id 1 via 192.0.2.2 fdb
     ip nexthop add id 10 group 1 fdb

     ip link add name vx0 up type vxlan id 10010 local 192.0.2.1 dstport 4789 proxy

     ip neigh add 192.0.2.3 lladdr 00:11:22:33:44:55 nud perm dev vx0

     bridge fdb add 00:11:22:33:44:55 dev vx0 self static nhid 10

     arping -b -c 1 -s 192.0.2.1 -I vx0 192.0.2.3

    [3]
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    [...]
    CPU: 13 UID: 0 PID: 372 Comm: ndisc6 Not tainted 6.17.0-rc2-virtme-g6ee90cb26014 #3 PREEMPT(voluntary)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-4.fc41 04/01/2014
    RIP: 0010:vxlan_xmit+0x803/0x1600
    [...]
    Call Trace:
     <TASK>
     dev_hard_start_xmit+0x5d/0x1c0
     __dev_queue_xmit+0x246/0xfd0
     ip6_finish_output2+0x210/0x6c0
     ip6_finish_output+0x1af/0x2b0
     ip6_mr_output+0x92/0x3e0
     ip6_send_skb+0x30/0x90
     rawv6_sendmsg+0xe6e/0x12e0
     __sock_sendmsg+0x38/0x70
     __sys_sendto+0x126/0x180
     __x64_sys_sendto+0x24/0x30
     do_syscall_64+0xa4/0x260
     entry_SYSCALL_64_after_hwframe+0x4b/0x53
    RIP: 0033:0x7f383422ec77

    [4]
     #!/bin/bash

     ip address add 2001:db8:1::1/128 dev lo

     ip nexthop add id 1 via 2001:db8:1::1 fdb
     ip nexthop add id 10 group 1 fdb

     ip link add name vx0 up type vxlan id 10010 local 2001:db8:1::1 dstport 4789 proxy

     ip neigh add 2001:db8:1::3 lladdr 00:11:22:33:44:55 nud perm dev vx0

     bridge fdb add 00:11:22:33:44:55 dev vx0 self static nhid 10

     ndisc6 -r 1 -s 2001:db8:1::1 -w 1 2001:db8:1::3 vx0

    Fixes: 1274e1c ("vxlan: ecmp support for mac fdb entries")
    Reviewed-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
    Link: https://patch.msgid.link/20250901065035.159644-3-idosch@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Guillaume Nault <gnault@redhat.com>
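The fix described in the vxlan commit above (check that the remote destination exists before dereferencing it) can be illustrated with a small userspace sketch. The struct and function names below are hypothetical stand-ins, not the kernel's definitions; the sketch only assumes that an FDB entry tied to a nexthop group has no first remote.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct remote_dst {
	bool is_any;	/* true for the "any" remote (0.0.0.0 / ::) */
};

struct fdb_entry {
	/* Assumed to be NULL when the entry is backed by an FDB nexthop group. */
	struct remote_dst *first_remote;
};

/*
 * Returns true only when the FDB entry exists, has a remote destination,
 * and that destination is the "any" remote. A missing entry or a missing
 * remote yields false instead of a NULL pointer dereference.
 */
bool mac_is_behind_any_remote(const struct fdb_entry *f)
{
	const struct remote_dst *rdst = NULL;

	if (f)
		rdst = f->first_remote;

	/* Check rdst before dereferencing it. */
	return rdst && rdst->is_any;
}
```

With this shape, an entry that has no remote destination is simply treated as "not behind the any remote", which matches the intent stated in the commit message.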
github-actions bot pushed a commit that referenced this pull request Oct 30, 2025
JIRA: https://issues.redhat.com/browse/RHEL-115358

commit c98cc97
Author: Steven Rostedt <rostedt@goodmis.org>
Date:   Tue May 27 10:58:20 2025 -0400

    ring-buffer: Move cpus_read_lock() outside of buffer->mutex

    Running a modified trace-cmd record --nosplice where it does a mmap of the
    ring buffer when '--nosplice' is set, caused the following lockdep splat:

     ======================================================
     WARNING: possible circular locking dependency detected
     6.15.0-rc7-test-00002-gfb7d03d8a82f #551 Not tainted
     ------------------------------------------------------
     trace-cmd/1113 is trying to acquire lock:
     ffff888100062888 (&buffer->mutex){+.+.}-{4:4}, at: ring_buffer_map+0x11c/0xe70

     but task is already holding lock:
     ffff888100a5f9f8 (&cpu_buffer->mapping_lock){+.+.}-{4:4}, at: ring_buffer_map+0xcf/0xe70

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #5 (&cpu_buffer->mapping_lock){+.+.}-{4:4}:
            __mutex_lock+0x192/0x18c0
            ring_buffer_map+0xcf/0xe70
            tracing_buffers_mmap+0x1c4/0x3b0
            __mmap_region+0xd8d/0x1f70
            do_mmap+0x9d7/0x1010
            vm_mmap_pgoff+0x20b/0x390
            ksys_mmap_pgoff+0x2e9/0x440
            do_syscall_64+0x79/0x1c0
            entry_SYSCALL_64_after_hwframe+0x76/0x7e

     -> #4 (&mm->mmap_lock){++++}-{4:4}:
            __might_fault+0xa5/0x110
            _copy_to_user+0x22/0x80
            _perf_ioctl+0x61b/0x1b70
            perf_ioctl+0x62/0x90
            __x64_sys_ioctl+0x134/0x190
            do_syscall_64+0x79/0x1c0
            entry_SYSCALL_64_after_hwframe+0x76/0x7e

     -> #3 (&cpuctx_mutex){+.+.}-{4:4}:
            __mutex_lock+0x192/0x18c0
            perf_event_init_cpu+0x325/0x7c0
            perf_event_init+0x52a/0x5b0
            start_kernel+0x263/0x3e0
            x86_64_start_reservations+0x24/0x30
            x86_64_start_kernel+0x95/0xa0
            common_startup_64+0x13e/0x141

     -> #2 (pmus_lock){+.+.}-{4:4}:
            __mutex_lock+0x192/0x18c0
            perf_event_init_cpu+0xb7/0x7c0
            cpuhp_invoke_callback+0x2c0/0x1030
            __cpuhp_invoke_callback_range+0xbf/0x1f0
            _cpu_up+0x2e7/0x690
            cpu_up+0x117/0x170
            cpuhp_bringup_mask+0xd5/0x120
            bringup_nonboot_cpus+0x13d/0x170
            smp_init+0x2b/0xf0
            kernel_init_freeable+0x441/0x6d0
            kernel_init+0x1e/0x160
            ret_from_fork+0x34/0x70
            ret_from_fork_asm+0x1a/0x30

     -> #1 (cpu_hotplug_lock){++++}-{0:0}:
            cpus_read_lock+0x2a/0xd0
            ring_buffer_resize+0x610/0x14e0
            __tracing_resize_ring_buffer.part.0+0x42/0x120
            tracing_set_tracer+0x7bd/0xa80
            tracing_set_trace_write+0x132/0x1e0
            vfs_write+0x21c/0xe80
            ksys_write+0xf9/0x1c0
            do_syscall_64+0x79/0x1c0
            entry_SYSCALL_64_after_hwframe+0x76/0x7e

     -> #0 (&buffer->mutex){+.+.}-{4:4}:
            __lock_acquire+0x1405/0x2210
            lock_acquire+0x174/0x310
            __mutex_lock+0x192/0x18c0
            ring_buffer_map+0x11c/0xe70
            tracing_buffers_mmap+0x1c4/0x3b0
            __mmap_region+0xd8d/0x1f70
            do_mmap+0x9d7/0x1010
            vm_mmap_pgoff+0x20b/0x390
            ksys_mmap_pgoff+0x2e9/0x440
            do_syscall_64+0x79/0x1c0
            entry_SYSCALL_64_after_hwframe+0x76/0x7e

     other info that might help us debug this:

     Chain exists of:
       &buffer->mutex --> &mm->mmap_lock --> &cpu_buffer->mapping_lock

      Possible unsafe locking scenario:

            CPU0                    CPU1
            ----                    ----
       lock(&cpu_buffer->mapping_lock);
                                    lock(&mm->mmap_lock);
                                    lock(&cpu_buffer->mapping_lock);
       lock(&buffer->mutex);

      *** DEADLOCK ***

     2 locks held by trace-cmd/1113:
      #0: ffff888106b847e0 (&mm->mmap_lock){++++}-{4:4}, at: vm_mmap_pgoff+0x192/0x390
      #1: ffff888100a5f9f8 (&cpu_buffer->mapping_lock){+.+.}-{4:4}, at: ring_buffer_map+0xcf/0xe70

     stack backtrace:
     CPU: 5 UID: 0 PID: 1113 Comm: trace-cmd Not tainted 6.15.0-rc7-test-00002-gfb7d03d8a82f #551 PREEMPT
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
     Call Trace:
      <TASK>
      dump_stack_lvl+0x6e/0xa0
      print_circular_bug.cold+0x178/0x1be
      check_noncircular+0x146/0x160
      __lock_acquire+0x1405/0x2210
      lock_acquire+0x174/0x310
      ? ring_buffer_map+0x11c/0xe70
      ? ring_buffer_map+0x11c/0xe70
      ? __mutex_lock+0x169/0x18c0
      __mutex_lock+0x192/0x18c0
      ? ring_buffer_map+0x11c/0xe70
      ? ring_buffer_map+0x11c/0xe70
      ? function_trace_call+0x296/0x370
      ? __pfx___mutex_lock+0x10/0x10
      ? __pfx_function_trace_call+0x10/0x10
      ? __pfx___mutex_lock+0x10/0x10
      ? _raw_spin_unlock+0x2d/0x50
      ? ring_buffer_map+0x11c/0xe70
      ? ring_buffer_map+0x11c/0xe70
      ? __mutex_lock+0x5/0x18c0
      ring_buffer_map+0x11c/0xe70
      ? do_raw_spin_lock+0x12d/0x270
      ? find_held_lock+0x2b/0x80
      ? _raw_spin_unlock+0x2d/0x50
      ? rcu_is_watching+0x15/0xb0
      ? _raw_spin_unlock+0x2d/0x50
      ? trace_preempt_on+0xd0/0x110
      tracing_buffers_mmap+0x1c4/0x3b0
      __mmap_region+0xd8d/0x1f70
      ? ring_buffer_lock_reserve+0x99/0xff0
      ? __pfx___mmap_region+0x10/0x10
      ? ring_buffer_lock_reserve+0x99/0xff0
      ? __pfx_ring_buffer_lock_reserve+0x10/0x10
      ? __pfx_ring_buffer_lock_reserve+0x10/0x10
      ? bpf_lsm_mmap_addr+0x4/0x10
      ? security_mmap_addr+0x46/0xd0
      ? lock_is_held_type+0xd9/0x130
      do_mmap+0x9d7/0x1010
      ? 0xffffffffc0370095
      ? __pfx_do_mmap+0x10/0x10
      vm_mmap_pgoff+0x20b/0x390
      ? __pfx_vm_mmap_pgoff+0x10/0x10
      ? 0xffffffffc0370095
      ksys_mmap_pgoff+0x2e9/0x440
      do_syscall_64+0x79/0x1c0
      entry_SYSCALL_64_after_hwframe+0x76/0x7e
     RIP: 0033:0x7fb0963a7de2
     Code: 00 00 00 0f 1f 44 00 00 41 f7 c1 ff 0f 00 00 75 27 55 89 cd 53 48 89 fb 48 85 ff 74 3b 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 5b 5d c3 0f 1f 00 48 8b 05 e1 9f 0d 00 64
     RSP: 002b:00007ffdcc8fb878 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
     RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb0963a7de2
     RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
     RBP: 0000000000000001 R08: 0000000000000006 R09: 0000000000000000
     R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
     R13: 00007ffdcc8fbe68 R14: 00007fb096628000 R15: 00005633e01a5c90
      </TASK>

    The issue is that cpus_read_lock() is taken within buffer->mutex. The
    memory mapped pages are taken with the mmap_lock held. The buffer->mutex
    is taken within the cpu_buffer->mapping_lock. There's quite a chain with
    all these locks, where the deadlock can be fixed by moving the
    cpus_read_lock() outside the taking of the buffer->mutex.

    Cc: stable@vger.kernel.org
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Vincent Donnefort <vdonnefort@google.com>
    Link: https://lore.kernel.org/20250527105820.0f45d045@gandalf.local.home
    Fixes: 117c392 ("ring-buffer: Introducing ring-buffer mapping functions")
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
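To make the dependency chain in the ring-buffer splat above easier to follow, here is a minimal sketch of a lockdep-style cycle check. It is not the kernel's lockdep implementation; the graph type, the function names, and the small integer lock IDs are all assumptions made for illustration. Each "held -> wanted" pair is recorded as an edge, and a new edge is rejected when the held lock is already reachable from the wanted one.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8	/* lock IDs must be in the range 0..MAX_LOCKS-1 */

/* dep[a][b] is true when lock b has been acquired while holding lock a. */
struct lock_graph {
	bool dep[MAX_LOCKS][MAX_LOCKS];
};

void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record the dependency "held -> wanted". Returns false, without adding
 * the edge, when the new edge would close a cycle in the graph.
 */
bool graph_add_dependency(struct lock_graph *g, int held, int wanted)
{
	bool seen[MAX_LOCKS] = { false };

	if (reachable(g, wanted, held, seen))
		return false;
	g->dep[held][wanted] = true;
	return true;
}
```

Feeding the six dependencies from the splat in order (buffer->mutex before cpu_hotplug_lock, and so on up to cpu_buffer->mapping_lock before buffer->mutex) makes the last call return false. If the first edge is reversed, as the commit does by taking cpus_read_lock() outside buffer->mutex, all six are accepted.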
github-actions bot pushed a commit that referenced this pull request Oct 30, 2025
[ Upstream commit a91c809 ]

The original code causes a circular locking dependency found by lockdep.

======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 Tainted: G S   U
------------------------------------------------------
xe_fault_inject/5091 is trying to acquire lock:
ffff888156815688 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}, at: __flush_work+0x25d/0x660

but task is already holding lock:

ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&devcd->mutex){+.+.}-{3:3}:
       mutex_lock_nested+0x4e/0xc0
       devcd_data_write+0x27/0x90
       sysfs_kf_bin_write+0x80/0xf0
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (kn->active#236){++++}-{0:0}:
       kernfs_drain+0x1e2/0x200
       __kernfs_remove+0xae/0x400
       kernfs_remove_by_name_ns+0x5d/0xc0
       remove_files+0x54/0x70
       sysfs_remove_group+0x3d/0xa0
       sysfs_remove_groups+0x2e/0x60
       device_remove_attrs+0xc7/0x100
       device_del+0x15d/0x3b0
       devcd_del+0x19/0x30
       process_one_work+0x22b/0x6f0
       worker_thread+0x1e8/0x3d0
       kthread+0x11c/0x250
       ret_from_fork+0x26c/0x2e0
       ret_from_fork_asm+0x1a/0x30
-> #0 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}:
       __lock_acquire+0x1661/0x2860
       lock_acquire+0xc4/0x2f0
       __flush_work+0x27a/0x660
       flush_delayed_work+0x5d/0xa0
       dev_coredump_put+0x63/0xa0
       xe_driver_devcoredump_fini+0x12/0x20 [xe]
       devm_action_release+0x12/0x30
       release_nodes+0x3a/0x120
       devres_release_all+0x8a/0xd0
       device_unbind_cleanup+0x12/0x80
       device_release_driver_internal+0x23a/0x280
       device_driver_detach+0x14/0x20
       unbind_store+0xaf/0xc0
       drv_attr_store+0x21/0x50
       sysfs_kf_write+0x4a/0x80
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
other info that might help us debug this:
Chain exists of: (work_completion)(&(&devcd->del_wk)->work) --> kn->active#236 --> &devcd->mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&devcd->mutex);
                               lock(kn->active#236);
                               lock(&devcd->mutex);
  lock((work_completion)(&(&devcd->del_wk)->work));
 *** DEADLOCK ***
5 locks held by xe_fault_inject/5091:
 #0: ffff8881129f9488 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x72/0xf0
 #1: ffff88810c755078 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x123/0x220
 #2: ffff8881054811a0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x55/0x280
 #3: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
 #4: ffffffff8359e020 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x660
stack backtrace:
CPU: 14 UID: 0 PID: 5091 Comm: xe_fault_inject Tainted: G S   U              6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 PREEMPT_{RT,(lazy)}
Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.10 12/13/2021
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xf0
 dump_stack+0x10/0x20
 print_circular_bug+0x285/0x360
 check_noncircular+0x135/0x150
 ? register_lock_class+0x48/0x4a0
 __lock_acquire+0x1661/0x2860
 lock_acquire+0xc4/0x2f0
 ? __flush_work+0x25d/0x660
 ? mark_held_locks+0x46/0x90
 ? __flush_work+0x25d/0x660
 __flush_work+0x27a/0x660
 ? __flush_work+0x25d/0x660
 ? trace_hardirqs_on+0x1e/0xd0
 ? __pfx_wq_barrier_func+0x10/0x10
 flush_delayed_work+0x5d/0xa0
 dev_coredump_put+0x63/0xa0
 xe_driver_devcoredump_fini+0x12/0x20 [xe]
 devm_action_release+0x12/0x30
 release_nodes+0x3a/0x120
 devres_release_all+0x8a/0xd0
 device_unbind_cleanup+0x12/0x80
 device_release_driver_internal+0x23a/0x280
 ? bus_find_device+0xa8/0xe0
 device_driver_detach+0x14/0x20
 unbind_store+0xaf/0xc0
 drv_attr_store+0x21/0x50
 sysfs_kf_write+0x4a/0x80
 kernfs_fop_write_iter+0x169/0x220
 vfs_write+0x293/0x560
 ksys_write+0x72/0xf0
 __x64_sys_write+0x19/0x30
 x64_sys_call+0x2bf/0x2660
 do_syscall_64+0x93/0xb60
 ? __f_unlock_pos+0x15/0x20
 ? __x64_sys_getdents64+0x9b/0x130
 ? __pfx_filldir64+0x10/0x10
 ? do_syscall_64+0x1a2/0xb60
 ? clear_bhb_loop+0x30/0x80
 ? clear_bhb_loop+0x30/0x80
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x76e292edd574
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
RSP: 002b:00007fffe247a828 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076e292edd574
RDX: 000000000000000c RSI: 00006267f6306063 RDI: 000000000000000b
RBP: 000000000000000c R08: 000076e292fc4b20 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 00006267f6306063
R13: 000000000000000b R14: 00006267e6859c00 R15: 000076e29322a000
 </TASK>
xe 0000:03:00.0: [drm] Xe device coredump has been deleted.

Fixes: 01daccf ("devcoredump : Serialize devcd_del work")
Cc: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250723142416.1020423-1-dev@lankhorst.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ removed const qualifier from bin_attribute callback parameters ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
roxanan1996 added a commit that referenced this pull request Oct 30, 2025
jira VULN-70531
cve CVE-2022-50070
commit-author Paolo Abeni <pabeni@redhat.com>
commit c886d70

Dipanjan reported a syzbot splat at close time:

WARNING: CPU: 1 PID: 10818 at net/ipv4/af_inet.c:153
inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153
Modules linked in: uio_ivshmem(OE) uio(E)
CPU: 1 PID: 10818 Comm: kworker/1:16 Tainted: G           OE
5.19.0-rc6-g2eae0556bb9d #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153
Code: 21 02 00 00 41 8b 9c 24 28 02 00 00 e9 07 ff ff ff e8 34 4d 91
f9 89 ee 4c 89 e7 e8 4a 47 60 ff e9 a6 fc ff ff e8 20 4d 91 f9 <0f> 0b
e9 84 fe ff ff e8 14 4d 91 f9 0f 0b e9 d4 fd ff ff e8 08 4d
RSP: 0018:ffffc9001b35fa78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000002879d0 RCX: ffff8881326f3b00
RDX: 0000000000000000 RSI: ffff8881326f3b00 RDI: 0000000000000002
RBP: ffff888179662674 R08: ffffffff87e983a0 R09: 0000000000000000
R10: 0000000000000005 R11: 00000000000004ea R12: ffff888179662400
R13: ffff888179662428 R14: 0000000000000001 R15: ffff88817e38e258
FS:  0000000000000000(0000) GS:ffff8881f5f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020007bc0 CR3: 0000000179592000 CR4: 0000000000150ee0
Call Trace:
 <TASK>
 __sk_destruct+0x4f/0x8e0 net/core/sock.c:2067
 sk_destruct+0xbd/0xe0 net/core/sock.c:2112
 __sk_free+0xef/0x3d0 net/core/sock.c:2123
 sk_free+0x78/0xa0 net/core/sock.c:2134
 sock_put include/net/sock.h:1927 [inline]
 __mptcp_close_ssk+0x50f/0x780 net/mptcp/protocol.c:2351
 __mptcp_destroy_sock+0x332/0x760 net/mptcp/protocol.c:2828
 mptcp_worker+0x5d2/0xc90 net/mptcp/protocol.c:2586
 process_one_work+0x9cc/0x1650 kernel/workqueue.c:2289
 worker_thread+0x623/0x1070 kernel/workqueue.c:2436
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
 </TASK>

The root cause of the problem is that an mptcp-level (re)transmit can
race with mptcp_close() and the packet scheduler checks the subflow
state before acquiring the socket lock: we can try to (re)transmit on
an already closed ssk.

Fix the issue by checking the subflow socket status again under the
subflow socket lock protection. Additionally add the missing check
for the fallback-to-tcp case.

Fixes: d5f4919 ("mptcp: allow picking different xmit subflows")
	Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com>
	Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>
	Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
	Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c886d70)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
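The mptcp fix above boils down to re-checking the subflow state once the subflow lock is held, because a check made before taking the lock can be stale by the time the transmit runs. The following userspace sketch shows that pattern with a pthread mutex; struct subflow and both functions are hypothetical and much simpler than the kernel's socket code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified subflow; not the kernel's struct sock. */
struct subflow {
	pthread_mutex_t lock;
	bool closed;
	int sent;	/* bytes accepted for transmission so far */
};

/*
 * Try to queue 'len' bytes on the subflow. Any state check done by the
 * caller before this point is only a hint; the authoritative check is
 * the one made here, under the lock. Returns the number of bytes queued,
 * or 0 if the subflow turned out to be closed.
 */
int subflow_try_send(struct subflow *ssk, int len)
{
	int queued = 0;

	pthread_mutex_lock(&ssk->lock);
	if (!ssk->closed) {
		ssk->sent += len;
		queued = len;
	}
	pthread_mutex_unlock(&ssk->lock);
	return queued;
}

/* Close the subflow under the same lock, so senders see a consistent state. */
void subflow_close(struct subflow *ssk)
{
	pthread_mutex_lock(&ssk->lock);
	ssk->closed = true;
	pthread_mutex_unlock(&ssk->lock);
}
```

Because both paths take the same lock, a send that loses the race with close returns 0 rather than operating on an already closed subflow.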
github-actions bot pushed a commit that referenced this pull request Oct 31, 2025
JIRA: https://issues.redhat.com/browse/RHEL-120705
Upstream Status: 5d627a9

Conflict(s):
  Checking file drivers/pci/controller/pcie-apple.c: Hunk #2 FAILED at 134.
  Due to upstream patch not taking into account the former commit
  31fccd6e85d7 "PCI: apple: Abstract register offsets via a SoC-specific
  structure".


commit 5d627a9
Author: Marc Zyngier <maz@kernel.org>
Date:   Tue May 13 18:28:17 2025 +0100

    PCI: apple: Convert to MSI parent infrastructure

    In an effort to move ARM64 away from the legacy MSI setup, convert the
    Apple PCIe driver to the MSI-parent infrastructure and let each device have
    its own MSI domain.

    [ tglx: Moved the struct out of the function call argument ]

    Signed-off-by: Marc Zyngier <maz@kernel.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
    Link: https://lore.kernel.org/all/20250513172819.2216709-8-maz@kernel.org

Signed-off-by: Myron Stowe <mstowe@redhat.com>
roxanan1996 added a commit that referenced this pull request Oct 31, 2025
jira VULN-70531
cve CVE-2022-50070
commit-author Paolo Abeni <pabeni@redhat.com>
commit c886d70

Dipanjan reported a syzbot splat at close time:

WARNING: CPU: 1 PID: 10818 at net/ipv4/af_inet.c:153
inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153
Modules linked in: uio_ivshmem(OE) uio(E)
CPU: 1 PID: 10818 Comm: kworker/1:16 Tainted: G           OE
5.19.0-rc6-g2eae0556bb9d #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153
Code: 21 02 00 00 41 8b 9c 24 28 02 00 00 e9 07 ff ff ff e8 34 4d 91
f9 89 ee 4c 89 e7 e8 4a 47 60 ff e9 a6 fc ff ff e8 20 4d 91 f9 <0f> 0b
e9 84 fe ff ff e8 14 4d 91 f9 0f 0b e9 d4 fd ff ff e8 08 4d
RSP: 0018:ffffc9001b35fa78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000002879d0 RCX: ffff8881326f3b00
RDX: 0000000000000000 RSI: ffff8881326f3b00 RDI: 0000000000000002
RBP: ffff888179662674 R08: ffffffff87e983a0 R09: 0000000000000000
R10: 0000000000000005 R11: 00000000000004ea R12: ffff888179662400
R13: ffff888179662428 R14: 0000000000000001 R15: ffff88817e38e258
FS:  0000000000000000(0000) GS:ffff8881f5f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020007bc0 CR3: 0000000179592000 CR4: 0000000000150ee0
Call Trace:
 <TASK>
 __sk_destruct+0x4f/0x8e0 net/core/sock.c:2067
 sk_destruct+0xbd/0xe0 net/core/sock.c:2112
 __sk_free+0xef/0x3d0 net/core/sock.c:2123
 sk_free+0x78/0xa0 net/core/sock.c:2134
 sock_put include/net/sock.h:1927 [inline]
 __mptcp_close_ssk+0x50f/0x780 net/mptcp/protocol.c:2351
 __mptcp_destroy_sock+0x332/0x760 net/mptcp/protocol.c:2828
 mptcp_worker+0x5d2/0xc90 net/mptcp/protocol.c:2586
 process_one_work+0x9cc/0x1650 kernel/workqueue.c:2289
 worker_thread+0x623/0x1070 kernel/workqueue.c:2436
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
 </TASK>

The root cause of the problem is that an mptcp-level (re)transmit can
race with mptcp_close() and the packet scheduler checks the subflow
state before acquiring the socket lock: we can try to (re)transmit on
an already closed ssk.

Fix the issue by checking the subflow socket status again under the
subflow socket lock protection. Additionally add the missing check
for the fallback-to-tcp case.

Fixes: d5f4919 ("mptcp: allow picking different xmit subflows")
	Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com>
	Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>
	Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
	Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c886d70)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
github-actions bot pushed a commit that referenced this pull request Nov 2, 2025
JIRA: https://issues.redhat.com/browse/RHEL-78200

upstream
========
commit ea04fe1
Author: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Date: Tue Apr 29 12:21:32 2025 +0530

description
===========
perf script tests fail with a segmentation fault, as below:

  92: perf script tests:
  --- start ---
  test child forked, pid 103769
  DB test
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB /tmp/perf-test-script.7rbftEpOzX/perf.data (9 samples) ]
  /usr/libexec/perf-core/tests/shell/script.sh: line 35:
  103780 Segmentation fault      (core dumped)
  perf script -i "${perfdatafile}" -s "${db_test}"
  --- Cleaning up ---
  ---- end(-1) ----
  92: perf script tests                                               : FAILED!

Backtrace pointed to :
	#0  0x0000000010247dd0 in maps.machine ()
	#1  0x00000000101d178c in db_export.sample ()
	#2  0x00000000103412c8 in python_process_event ()
	#3  0x000000001004eb28 in process_sample_event ()
	#4  0x000000001024fcd0 in machines.deliver_event ()
	#5  0x000000001025005c in perf_session.deliver_event ()
	#6  0x00000000102568b0 in __ordered_events__flush.part.0 ()
	#7  0x0000000010251618 in perf_session.process_events ()
	#8  0x0000000010053620 in cmd_script ()
	#9  0x00000000100b5a28 in run_builtin ()
	#10 0x00000000100b5f94 in handle_internal_command ()
	#11 0x0000000010011114 in main ()

Further investigation reveals that this occurs in the `perf script tests`,
because it uses `db_test.py` script. This script sets `perf_db_export_mode = True`.

With `perf_db_export_mode` enabled, if a sample originates from a hypervisor,
perf doesn't set maps for "[H]" sample in the code. Consequently, `al->maps` remains NULL
when `maps__machine(al->maps)` is called from `db_export__sample`.

As al->maps can be NULL in the case of Hypervisor samples, use thread->maps,
because even for a Hypervisor sample the machine should exist.
If we don't have a machine for some reason, return -1 to avoid a segmentation fault.

    Reported-by: Disha Goel <disgoel@linux.ibm.com>
    Signed-off-by: Aditya Bodkhe <aditya.b1@linux.ibm.com>
    Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
    Tested-by: Disha Goel <disgoel@linux.ibm.com>
    Link: https://lore.kernel.org/r/20250429065132.36839-1-adityab1@linux.ibm.com
    Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
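The approach described in the perf commit above (look up the machine through thread->maps and return -1 if there is none) can be sketched in isolation as follows. The types are reduced to the fields needed for the illustration, and sample_machine() is a hypothetical helper, not a function from the perf source tree.

```c
#include <stddef.h>

/* Hypothetical, minimal versions of the perf structures involved. */
struct machine {
	int id;
};

struct maps {
	struct machine *machine;
};

struct thread {
	struct maps *maps;
};

/*
 * Resolve the machine through the thread's maps rather than through the
 * sample's address location, which may carry no maps for hypervisor
 * samples. Returns 0 and stores the machine in *out on success, or -1
 * when no machine can be found.
 */
int sample_machine(const struct thread *thread, struct machine **out)
{
	struct machine *machine = NULL;

	if (thread && thread->maps)
		machine = thread->maps->machine;
	if (!machine)
		return -1;

	*out = machine;
	return 0;
}
```

Returning an error instead of dereferencing a NULL maps pointer lets the caller skip the sample rather than crash.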
github-actions bot pushed a commit that referenced this pull request Nov 2, 2025
JIRA: https://issues.redhat.com/browse/RHEL-78200

upstream
========
commit c21986d
Author: Sergei Trofimovich <slyich@gmail.com>
Date: Mon May 5 18:44:19 2025 +0100

description
===========
Without the change `perf` hangs on character devices. On my system
it's enough to run a system-wide sampler for a few seconds to get the
hangup:

    $ perf record -a -g --call-graph=dwarf
    $ perf report
    # hung

`strace` shows that the hangup happens while reading a character device,
`/dev/dri/renderD128`:

    $ strace -y -f -p 2780484
    strace: Process 2780484 attached
    pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached

Its call trace descends into `elfutils`:

    $ gdb -p 2780484
    (gdb) bt
    #0  0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
        at ../sysdeps/unix/sysv/linux/pread64.c:25
    #1  0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
    #2  0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #3  0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #4  0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
       from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #5  0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
        at util/dso.h:537
    #6  0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
    #7  frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
    #8  0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #9  0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
        thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
        best_effort=best_effort@entry=false) at util/thread.h:152
    #13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
        sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
    #14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
        max_stack=127, symbols=true) at util/machine.c:2920
    #15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
        sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
        at util/machine.c:2970
    #16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
        sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
    #17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
        evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
    #18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
        arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
    #19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
        evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
    #20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
        file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
    #21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
    #22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
    #23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
        file_path=0x105ffbf0 "perf.data") at util/session.c:1419
    #24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
        prog=prog@entry=0x7fff9df81220) at util/session.c:2132
    #25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
        at util/session.c:2181
    #26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
    #27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
    #28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
    #29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
    #30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
        at perf.c:351
    #31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
    #32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
    #33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556

The hangup happens because nothing in `perf` or `elfutils` checks if a
mapped file is easily readable.

The change conservatively skips all non-regular files.
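
As an illustration of "skip all non-regular files", here is a minimal sketch using the POSIX `stat()` call and the `S_ISREG()` macro. The helper name `is_regular_file()` is hypothetical and this is not the code from the upstream patch; it only shows the kind of check involved.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: true only if path names a regular file.
 * Character devices, directories, FIFOs and sockets are all rejected,
 * as are paths that cannot be stat()ed at all.
 */
static bool is_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;	/* missing or inaccessible: do not try to read it */
	return S_ISREG(st.st_mode);
}
```

A caller would test a mapped file with this before handing it to a library that reads it unconditionally, so a device node such as `/dev/dri/renderD128` is never opened for reading.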

    Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
    Acked-by: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
github-actions bot pushed a commit that referenced this pull request Nov 2, 2025
JIRA: https://issues.redhat.com/browse/RHEL-78200

upstream
========
commit 9c9f4a2
Author: Ian Rogers <irogers@google.com>
Date: Tue Jun 24 14:05:00 2025 -0700

description
===========
Symbolize stack traces by creating a live machine. Add this
functionality to dump_stack and switch dump_stack users to use
it. Switch TUI to use it. Add stack traces to the child test function
which can be useful to diagnose blocked code.

Example output:
```
$ perf test -vv PERF_RECORD_
...
  7: PERF_RECORD_* events & perf_sample fields:
  7: PERF_RECORD_* events & perf_sample fields                       : Running (1 active)
^C
Signal (2) while running tests.
Terminating tests with the same signal
Internal test harness failure. Completing any started tests:
:  7: PERF_RECORD_* events & perf_sample fields:

---- unexpected signal (2) ----
    #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
    #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
    #2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
    #3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
    #4 0x7fc12fef1393 in __nanosleep nanosleep.c:26
    #5 0x7fc12ff02d68 in __sleep sleep.c:55
    #6 0x55788c63196b in test__PERF_RECORD perf-record.c:0
    #7 0x55788c620fb0 in run_test_child builtin-test.c:0
    #8 0x55788c5bd18d in start_command run-command.c:127
    #9 0x55788c621ef3 in __cmd_test builtin-test.c:0
    #10 0x55788c6225bf in cmd_test ??:0
    #11 0x55788c5afbd0 in run_builtin perf.c:0
    #12 0x55788c5afeeb in handle_internal_command perf.c:0
    #13 0x55788c52b383 in main ??:0
    #14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
    #15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
    #16 0x55788c52b9d1 in _start ??:0

---- unexpected signal (2) ----
    #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
    #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
    #2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45
    #3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26
    #4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36
    #5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0
    #6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
    #7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
    #8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
    #9 0x7fc12fef1393 in __nanosleep nanosleep.c:26
    #10 0x7fc12ff02d68 in __sleep sleep.c:55
    #11 0x55788c63196b in test__PERF_RECORD perf-record.c:0
    #12 0x55788c620fb0 in run_test_child builtin-test.c:0
    #13 0x55788c5bd18d in start_command run-command.c:127
    #14 0x55788c621ef3 in __cmd_test builtin-test.c:0
    #15 0x55788c6225bf in cmd_test ??:0
    #16 0x55788c5afbd0 in run_builtin perf.c:0
    #17 0x55788c5afeeb in handle_internal_command perf.c:0
    #18 0x55788c52b383 in main ??:0
    #19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
    #20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
    #21 0x55788c52b9d1 in _start ??:0
  7: PERF_RECORD_* events & perf_sample fields                       : Skip (permissions)
```

    Signed-off-by: Ian Rogers <irogers@google.com>
    Link: https://lore.kernel.org/r/20250624210500.2121303-1-irogers@google.com
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
github-actions bot pushed a commit that referenced this pull request Nov 2, 2025
JIRA: https://issues.redhat.com/browse/RHEL-78200

upstream
========
commit c72bf82
Author: Thomas Falcon <thomas.falcon@intel.com>
Date: Thu Jun 12 11:36:59 2025 -0500

description
===========
Calling perf top with branch filters enabled on Intel CPUs
with branch counters logging (a.k.a. LBR event logging [1]) support
results in a segfault.

$ perf top  -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter
...
Thread 27 "perf" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffafff76c0 (LWP 949003)]
perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
653			*width = env->cpu_pmu_caps ? env->br_cntr_width :
(gdb) bt
 #0  perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
 #1  0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345
 #2  0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389
 #3  0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422
 #4  0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850
 #5  0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737
 #6  0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359
 #7  0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845
 #8  0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211
 #9  0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245
 #10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
 #11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342
 #12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120
 #13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448
 #14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The cause is that perf_env__find_br_cntr_info tries to access a
null pointer pmu_caps in the perf_env struct. A similar issue exists
for homogeneous core systems which use the cpu_pmu_caps structure.

Fix this by populating cpu_pmu_caps and pmu_caps structures with
values from sysfs when calling perf top with branch stack sampling
enabled.

[1], LBR event logging introduced here:
https://lore.kernel.org/all/20231025201626.3000228-5-kan.liang@linux.intel.com/

    Reviewed-by: Ian Rogers <irogers@google.com>
    Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
    Link: https://lore.kernel.org/r/20250612163659.1357950-2-thomas.falcon@intel.com
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
github-actions bot pushed a commit that referenced this pull request Nov 4, 2025
JIRA: https://issues.redhat.com/browse/RHEL-115639
Upstream Status: linux.git
Conflicts: (context) Missing upstream commit 5cde39e ("vxlan:
           Rename FDB Tx lookup function"):
           The vxlan_find_mac_tx() function is still called
           "vxlan_find_mac" in CentOS Stream 10.

commit 1f5d2fd
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Mon Sep 1 09:50:34 2025 +0300

    vxlan: Fix NPD in {arp,neigh}_reduce() when using nexthop objects

    When the "proxy" option is enabled on a VXLAN device, the device will
    suppress ARP requests and IPv6 Neighbor Solicitation messages if it is
    able to reply on behalf of the remote host. That is, if a matching and
    valid neighbor entry is configured on the VXLAN device whose MAC address
    is not behind the "any" remote (0.0.0.0 / ::).

    The code currently assumes that the FDB entry for the neighbor's MAC
    address points to a valid remote destination, but this is incorrect if
    the entry is associated with an FDB nexthop group. This can result in a
    NPD [1][3] which can be reproduced using [2][4].

    Fix by checking that the remote destination exists before dereferencing
    it.
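
A minimal sketch of the check described above, assuming a simplified FDB entry whose remote pointer is NULL when the entry is backed by a nexthop group. The `remote` and `fdb_entry` types and `can_proxy_reply()` are hypothetical names used for illustration, not the vxlan driver's own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified remote destination: 0 stands for "any". */
struct remote { uint32_t ip; };

/* Hypothetical FDB entry: first_remote is NULL for nexthop-group entries. */
struct fdb_entry { const struct remote *first_remote; };

/*
 * Decide whether the device may answer on behalf of the remote host.
 * The remote must exist before its address is inspected.
 */
static bool can_proxy_reply(const struct fdb_entry *f)
{
	const struct remote *rdst;

	if (!f)
		return false;
	rdst = f->first_remote;
	if (!rdst)
		return false;	/* nexthop group: nothing to dereference */
	return rdst->ip != 0;	/* MAC must not sit behind the "any" remote */
}
```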

    [1]
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    [...]
    CPU: 4 UID: 0 PID: 365 Comm: arping Not tainted 6.17.0-rc2-virtme-g2a89cb21162c #2 PREEMPT(voluntary)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-4.fc41 04/01/2014
    RIP: 0010:vxlan_xmit+0xb58/0x15f0
    [...]
    Call Trace:
     <TASK>
     dev_hard_start_xmit+0x5d/0x1c0
     __dev_queue_xmit+0x246/0xfd0
     packet_sendmsg+0x113a/0x1850
     __sock_sendmsg+0x38/0x70
     __sys_sendto+0x126/0x180
     __x64_sys_sendto+0x24/0x30
     do_syscall_64+0xa4/0x260
     entry_SYSCALL_64_after_hwframe+0x4b/0x53

    [2]
     #!/bin/bash

     ip address add 192.0.2.1/32 dev lo

     ip nexthop add id 1 via 192.0.2.2 fdb
     ip nexthop add id 10 group 1 fdb

     ip link add name vx0 up type vxlan id 10010 local 192.0.2.1 dstport 4789 proxy

     ip neigh add 192.0.2.3 lladdr 00:11:22:33:44:55 nud perm dev vx0

     bridge fdb add 00:11:22:33:44:55 dev vx0 self static nhid 10

     arping -b -c 1 -s 192.0.2.1 -I vx0 192.0.2.3

    [3]
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    [...]
    CPU: 13 UID: 0 PID: 372 Comm: ndisc6 Not tainted 6.17.0-rc2-virtme-g6ee90cb26014 #3 PREEMPT(voluntary)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-4.fc41 04/01/2014
    RIP: 0010:vxlan_xmit+0x803/0x1600
    [...]
    Call Trace:
     <TASK>
     dev_hard_start_xmit+0x5d/0x1c0
     __dev_queue_xmit+0x246/0xfd0
     ip6_finish_output2+0x210/0x6c0
     ip6_finish_output+0x1af/0x2b0
     ip6_mr_output+0x92/0x3e0
     ip6_send_skb+0x30/0x90
     rawv6_sendmsg+0xe6e/0x12e0
     __sock_sendmsg+0x38/0x70
     __sys_sendto+0x126/0x180
     __x64_sys_sendto+0x24/0x30
     do_syscall_64+0xa4/0x260
     entry_SYSCALL_64_after_hwframe+0x4b/0x53
    RIP: 0033:0x7f383422ec77

    [4]
     #!/bin/bash

     ip address add 2001:db8:1::1/128 dev lo

     ip nexthop add id 1 via 2001:db8:1::1 fdb
     ip nexthop add id 10 group 1 fdb

     ip link add name vx0 up type vxlan id 10010 local 2001:db8:1::1 dstport 4789 proxy

     ip neigh add 2001:db8:1::3 lladdr 00:11:22:33:44:55 nud perm dev vx0

     bridge fdb add 00:11:22:33:44:55 dev vx0 self static nhid 10

     ndisc6 -r 1 -s 2001:db8:1::1 -w 1 2001:db8:1::3 vx0

    Fixes: 1274e1c ("vxlan: ecmp support for mac fdb entries")
    Reviewed-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
    Link: https://patch.msgid.link/20250901065035.159644-3-idosch@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Guillaume Nault <gnault@redhat.com>
github-actions bot pushed a commit that referenced this pull request Nov 4, 2025
JIRA: https://issues.redhat.com/browse/RHEL-105612

commit a9c83a0
Author: Jens Axboe <axboe@kernel.dk>
Date:   Mon Dec 30 14:15:17 2024 -0700

    io_uring/timeout: flush timeouts outside of the timeout lock
    
    syzbot reports that a recent fix causes nesting issues between the (now)
    raw timeout lock and the eventfd locking:
    
    =============================
    [ BUG: Invalid wait context ]
    6.13.0-rc4-00080-g9828a4c0901f #29 Not tainted
    -----------------------------
    kworker/u32:0/68094 is trying to lock:
    ffff000014d7a520 (&ctx->wqh#2){..-.}-{3:3}, at: eventfd_signal_mask+0x64/0x180
    other info that might help us debug this:
    context-{5:5}
    6 locks held by kworker/u32:0/68094:
     #0: ffff0000c1d98148 ((wq_completion)iou_exit){+.+.}-{0:0}, at: process_one_work+0x4e8/0xfc0
     #1: ffff80008d927c78 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x53c/0xfc0
     #2: ffff0000c59bc3d8 (&ctx->completion_lock){+.+.}-{3:3}, at: io_kill_timeouts+0x40/0x180
     #3: ffff0000c59bc358 (&ctx->timeout_lock){-.-.}-{2:2}, at: io_kill_timeouts+0x48/0x180
     #4: ffff800085127aa0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x8/0x38
     #5: ffff800085127aa0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x8/0x38
    stack backtrace:
    CPU: 7 UID: 0 PID: 68094 Comm: kworker/u32:0 Not tainted 6.13.0-rc4-00080-g9828a4c0901f #29
    Hardware name: linux,dummy-virt (DT)
    Workqueue: iou_exit io_ring_exit_work
    Call trace:
     show_stack+0x1c/0x30 (C)
     __dump_stack+0x24/0x30
     dump_stack_lvl+0x60/0x80
     dump_stack+0x14/0x20
     __lock_acquire+0x19f8/0x60c8
     lock_acquire+0x1a4/0x540
     _raw_spin_lock_irqsave+0x90/0xd0
     eventfd_signal_mask+0x64/0x180
     io_eventfd_signal+0x64/0x108
     io_req_local_work_add+0x294/0x430
     __io_req_task_work_add+0x1c0/0x270
     io_kill_timeout+0x1f0/0x288
     io_kill_timeouts+0xd4/0x180
     io_uring_try_cancel_requests+0x2e8/0x388
     io_ring_exit_work+0x150/0x550
     process_one_work+0x5e8/0xfc0
     worker_thread+0x7ec/0xc80
     kthread+0x24c/0x300
     ret_from_fork+0x10/0x20
    
    because after the preempt-rt fix for the timeout lock nesting inside
    the io-wq lock, we now have the eventfd spinlock nesting inside the
    raw timeout spinlock.
    
    Rather than play whack-a-mole with other nesting on the timeout lock,
    split the deletion and killing of timeouts so queueing the task_work
    for the timeout cancelations can get done outside of the timeout lock.
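
The split described above can be sketched in userspace as a two-phase pattern: detach entries while holding the lock, then do the work that may take other locks only after releasing it. This is a simplified illustration with a pthread mutex and hypothetical `ctx` / `timeout` types, not the io_uring implementation.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical timeout entry on a singly linked list. */
struct timeout {
	struct timeout *next;
	int cancelled;
};

/* Hypothetical context owning the list and the lock that protects it. */
struct ctx {
	pthread_mutex_t timeout_lock;
	struct timeout *timeouts;
};

/* Returns the number of timeouts that were killed. */
static int kill_timeouts(struct ctx *ctx)
{
	struct timeout *list, *t;
	int killed = 0;

	/* Phase 1: under the lock, only move entries to a private list. */
	pthread_mutex_lock(&ctx->timeout_lock);
	list = ctx->timeouts;
	ctx->timeouts = NULL;
	pthread_mutex_unlock(&ctx->timeout_lock);

	/* Phase 2: lock dropped, so this step may safely take other locks. */
	while ((t = list) != NULL) {
		list = t->next;
		t->next = NULL;
		t->cancelled = 1;	/* stands in for queueing the task_work */
		killed++;
	}
	return killed;
}
```

Because the second phase runs with no lock held, whatever it calls can acquire its own locks without nesting inside the timeout lock.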
    
    Reported-by: syzbot+b1fc199a40b65d601b65@syzkaller.appspotmail.com
    Fixes: 020b40f ("io_uring: make ctx->timeout_lock a raw spinlock")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
github-actions bot pushed a commit that referenced this pull request Nov 6, 2025
JIRA: https://issues.redhat.com/browse/RHEL-119009
Upstream Status: kernel/git/torvalds/linux.git

commit b1bf1a7
Author: Sheng Yong <shengyong1@xiaomi.com>
Date:   Thu Jul 10 14:48:55 2025 +0800

    dm-bufio: fix sched in atomic context

    If "try_verify_in_tasklet" is set for dm-verity, DM_BUFIO_CLIENT_NO_SLEEP
    is enabled for dm-bufio. However, when bufio tries to evict buffers, it
    may end up scheduling while holding spin_lock_bh, which triggers the
    following warning:

    BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2745
    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 123, name: kworker/2:2
    preempt_count: 201, expected: 0
    RCU nest depth: 0, expected: 0
    4 locks held by kworker/2:2/123:
     #0: ffff88800a2d1548 ((wq_completion)dm_bufio_cache){....}-{0:0}, at: process_one_work+0xe46/0x1970
     #1: ffffc90000d97d20 ((work_completion)(&dm_bufio_replacement_work)){....}-{0:0}, at: process_one_work+0x763/0x1970
     #2: ffffffff8555b528 (dm_bufio_clients_lock){....}-{3:3}, at: do_global_cleanup+0x1ce/0x710
     #3: ffff88801d5820b8 (&c->spinlock){....}-{2:2}, at: do_global_cleanup+0x2a5/0x710
    Preemption disabled at:
    [<0000000000000000>] 0x0
    CPU: 2 UID: 0 PID: 123 Comm: kworker/2:2 Not tainted 6.16.0-rc3-g90548c634bd0 #305 PREEMPT(voluntary)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
    Workqueue: dm_bufio_cache do_global_cleanup
    Call Trace:
     <TASK>
     dump_stack_lvl+0x53/0x70
     __might_resched+0x360/0x4e0
     do_global_cleanup+0x2f5/0x710
     process_one_work+0x7db/0x1970
     worker_thread+0x518/0xea0
     kthread+0x359/0x690
     ret_from_fork+0xf3/0x1b0
     ret_from_fork_asm+0x1a/0x30
     </TASK>

    That can be reproduced by:

      veritysetup format --data-block-size=4096 --hash-block-size=4096 /dev/vda /dev/vdb
      SIZE=$(blockdev --getsz /dev/vda)
      dmsetup create myverity -r --table "0 $SIZE verity 1 /dev/vda /dev/vdb 4096 4096 <data_blocks> 1 sha256 <root_hash> <salt> 1 try_verify_in_tasklet"
      mount /dev/dm-0 /mnt -o ro
      echo 102400 > /sys/module/dm_bufio/parameters/max_cache_size_bytes
      [read files in /mnt]

    Cc: stable@vger.kernel.org      # v6.4+
    Fixes: 450e8de ("dm bufio: improve concurrent IO performance")
    Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
    Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
PlaidCat added a commit that referenced this pull request Nov 6, 2025
jira LE-4669
cve CVE-2025-21725
Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10
commit-author Paulo Alcantara <pc@manguebit.com>
commit be7a6a7

It isn't guaranteed that NETWORK_INTERFACE_INFO::LinkSpeed will always
be set by the server, so the client must handle any values and then
prevent oopses like below from happening:

Oops: divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 1323 Comm: cat Not tainted 6.13.0-rc7 #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41
04/01/2014
RIP: 0010:cifs_debug_data_proc_show+0xa45/0x1460 [cifs] Code: 00 00 48
89 df e8 3b cd 1b c1 41 f6 44 24 2c 04 0f 84 50 01 00 00 48 89 ef e8
e7 d0 1b c1 49 8b 44 24 18 31 d2 49 8d 7c 24 28 <48> f7 74 24 18 48 89
c3 e8 6e cf 1b c1 41 8b 6c 24 28 49 8d 7c 24
RSP: 0018:ffffc90001817be0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811230022c RCX: ffffffffc041bd99
RDX: 0000000000000000 RSI: 0000000000000567 RDI: ffff888112300228
RBP: ffff888112300218 R08: fffff52000302f5f R09: ffffed1022fa58ac
R10: ffff888117d2c566 R11: 00000000fffffffe R12: ffff888112300200
R13: 000000012a15343f R14: 0000000000000001 R15: ffff888113f2db58
FS: 00007fe27119e740(0000) GS:ffff888148600000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe2633c5000 CR3: 0000000124da0000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die_body.cold+0x19/0x27
 ? die+0x2e/0x50
 ? do_trap+0x159/0x1b0
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? do_error_trap+0x90/0x130
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? exc_divide_error+0x39/0x50
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? asm_exc_divide_error+0x1a/0x20
 ? cifs_debug_data_proc_show+0xa39/0x1460 [cifs]
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? seq_read_iter+0x42e/0x790
 seq_read_iter+0x19a/0x790
 proc_reg_read_iter+0xbe/0x110
 ? __pfx_proc_reg_read_iter+0x10/0x10
 vfs_read+0x469/0x570
 ? do_user_addr_fault+0x398/0x760
 ? __pfx_vfs_read+0x10/0x10
 ? find_held_lock+0x8a/0xa0
 ? __pfx_lock_release+0x10/0x10
 ksys_read+0xd3/0x170
 ? __pfx_ksys_read+0x10/0x10
 ? __rcu_read_unlock+0x50/0x270
 ? mark_held_locks+0x1a/0x90
 do_syscall_64+0xbb/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe271288911
Code: 00 48 8b 15 01 25 10 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8
20 ad 01 00 f3 0f 1e fa 80 3d b5 a7 10 00 00 74 13 31 c0 0f 05 <48> 3d
00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec
RSP: 002b:00007ffe87c079d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007fe271288911
RDX: 0000000000040000 RSI: 00007fe2633c6000 RDI: 0000000000000003
RBP: 00007ffe87c07a00 R08: 0000000000000000 R09: 00007fe2713e6380
R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000040000
R13: 00007fe2633c6000 R14: 0000000000000003 R15: 0000000000000000
 </TASK>

Fix this by setting cifs_server_iface::speed to a sane value (1Gbps)
by default when link speed is unset.
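
A minimal sketch of the defaulting idea, assuming speeds are kept in bits per second. The 1 Gbps default comes from the commit text; the helper names `iface_speed()` and `iface_weight()` and the ratio calculation are hypothetical, shown only to make the divide-by-zero hazard concrete.

```c
#include <stdint.h>

/* 1 Gbps fallback, as stated in the commit message. */
#define DEFAULT_LINK_SPEED_BPS 1000000000ULL

/* Never hand back 0, so later divisions by the speed are safe. */
static uint64_t iface_speed(uint64_t reported_bps)
{
	return reported_bps ? reported_bps : DEFAULT_LINK_SPEED_BPS;
}

/* Hypothetical weighting: how many times faster the best link is. */
static uint64_t iface_weight(uint64_t fastest_bps, uint64_t iface_bps)
{
	return iface_speed(fastest_bps) / iface_speed(iface_bps);
}
```

With a server that reports a link speed of 0, `iface_weight()` divides by the 1 Gbps default rather than by zero.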

	Cc: Shyam Prasad N <nspmangalore@gmail.com>
	Cc: Tom Talpey <tom@talpey.com>
Fixes: a6d8fb5 ("cifs: distribute channels across interfaces based on speed")
	Reported-by: Frank Sorenson <sorenson@redhat.com>
	Reported-by: Jay Shin <jaeshin@redhat.com>
	Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
	Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit be7a6a7)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
PlaidCat added a commit that referenced this pull request Nov 6, 2025
jira LE-4669
cve CVE-2025-38244
Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10
commit-author Paulo Alcantara <pc@manguebit.org>
commit 711741f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/711741f9.failed

Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening

======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S      W
------------------------------------------------------
cifsd/6055 is trying to acquire lock:
ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200

but task is already holding lock:
ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&ret_buf->chan_lock){+.+.}-{3:3}:
       validate_chain+0x1cf/0x270
       __lock_acquire+0x60e/0x780
       lock_acquire.part.0+0xb4/0x1f0
       _raw_spin_lock+0x2f/0x40
       cifs_setup_session+0x81/0x4b0
       cifs_get_smb_ses+0x771/0x900
       cifs_mount_get_session+0x7e/0x170
       cifs_mount+0x92/0x2d0
       cifs_smb3_do_mount+0x161/0x460
       smb3_get_tree+0x55/0x90
       vfs_get_tree+0x46/0x180
       do_new_mount+0x1b0/0x2e0
       path_mount+0x6ee/0x740
       do_mount+0x98/0xe0
       __do_sys_mount+0x148/0x180
       do_syscall_64+0xa4/0x260
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #1 (&ret_buf->ses_lock){+.+.}-{3:3}:
       validate_chain+0x1cf/0x270
       __lock_acquire+0x60e/0x780
       lock_acquire.part.0+0xb4/0x1f0
       _raw_spin_lock+0x2f/0x40
       cifs_match_super+0x101/0x320
       sget+0xab/0x270
       cifs_smb3_do_mount+0x1e0/0x460
       smb3_get_tree+0x55/0x90
       vfs_get_tree+0x46/0x180
       do_new_mount+0x1b0/0x2e0
       path_mount+0x6ee/0x740
       do_mount+0x98/0xe0
       __do_sys_mount+0x148/0x180
       do_syscall_64+0xa4/0x260
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}:
       check_noncircular+0x95/0xc0
       check_prev_add+0x115/0x2f0
       validate_chain+0x1cf/0x270
       __lock_acquire+0x60e/0x780
       lock_acquire.part.0+0xb4/0x1f0
       _raw_spin_lock+0x2f/0x40
       cifs_signal_cifsd_for_reconnect+0x134/0x200
       __cifs_reconnect+0x8f/0x500
       cifs_handle_standard+0x112/0x280
       cifs_demultiplex_thread+0x64d/0xbc0
       kthread+0x2f7/0x310
       ret_from_fork+0x2a/0x230
       ret_from_fork_asm+0x1a/0x30

other info that might help us debug this:

Chain exists of:
  &tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ret_buf->chan_lock);
                               lock(&ret_buf->ses_lock);
                               lock(&ret_buf->chan_lock);
  lock(&tcp_ses->srv_lock);

 *** DEADLOCK ***

3 locks held by cifsd/6055:
 #0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200
 #1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200
 #2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
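
The report above gives the established order as `srv_lock`, then `ses_lock`, then `chan_lock`. One way to respect such an order is sketched below with pthread mutexes. The `server` / `session` types and `signal_reconnect()` are hypothetical, and the actual patch may restructure the locking differently; the sketch only shows acquiring in one fixed order and releasing in reverse.

```c
#include <pthread.h>

/* Hypothetical server state protected by srv_lock. */
struct server {
	pthread_mutex_t srv_lock;
	int needs_reconnect;
};

/* Hypothetical session state protected by ses_lock and chan_lock. */
struct session {
	pthread_mutex_t ses_lock;
	pthread_mutex_t chan_lock;
	struct server *server;
	int chan_needs_reconnect;
};

/* Always lock srv_lock -> ses_lock -> chan_lock, unlock in reverse. */
static void signal_reconnect(struct session *ses)
{
	struct server *srv = ses->server;

	pthread_mutex_lock(&srv->srv_lock);
	pthread_mutex_lock(&ses->ses_lock);
	pthread_mutex_lock(&ses->chan_lock);

	ses->chan_needs_reconnect = 1;
	srv->needs_reconnect = 1;

	pthread_mutex_unlock(&ses->chan_lock);
	pthread_mutex_unlock(&ses->ses_lock);
	pthread_mutex_unlock(&srv->srv_lock);
}
```

If every path that needs more than one of these locks follows the same order, the circular dependency shown in the report cannot form.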

	Cc: linux-cifs@vger.kernel.org
	Reported-by: David Howells <dhowells@redhat.com>
Fixes: d7d7a66 ("cifs: avoid use of global locks for high contention data")
	Reviewed-by: David Howells <dhowells@redhat.com>
	Tested-by: David Howells <dhowells@redhat.com>
	Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
	Signed-off-by: David Howells <dhowells@redhat.com>
	Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit 711741f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	fs/cifs/cifsglob.h
#	fs/cifs/connect.c
github-actions bot pushed a commit that referenced this pull request Nov 7, 2025
Michael Chan says:

====================
bnxt_en: Bug fixes

Patches 1, 3, and 4 are bug fixes related to the FW log tracing driver
coredump feature recently added in 6.13.  Patch #1 adds the necessary
call to shut down the FW logging DMA during PCI shutdown.  Patch #3 fixes
a possible null pointer dereference when using early versions of the FW
with this feature.  Patch #4 adds the coredump header information
unconditionally to make it more robust.

Patch #2 fixes a possible memory leak during PTP shutdown.  Patch #5
eliminates a dmesg warning when doing devlink reload.
====================

Link: https://patch.msgid.link/20251104005700.542174-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
github-actions bot pushed a commit that referenced this pull request Nov 8, 2025
On completion of i915_vma_pin_ww(), a synchronous variant of
dma_fence_work_commit() is called.  When pinning a VMA to GGTT address
space on a Cherry View family processor, or on a Broxton generation SoC
with VTD enabled, i.e., when stop_machine() is then called from
intel_ggtt_bind_vma(), that can potentially lead to lock inversion among
reservation_ww and cpu_hotplug locks.

[86.861179] ======================================================
[86.861193] WARNING: possible circular locking dependency detected
[86.861209] 6.15.0-rc5-CI_DRM_16515-gca0305cadc2d+ #1 Tainted: G     U
[86.861226] ------------------------------------------------------
[86.861238] i915_module_loa/1432 is trying to acquire lock:
[86.861252] ffffffff83489090 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x1c/0x50
[86.861290]
but task is already holding lock:
[86.861303] ffffc90002e0b4c8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915]
[86.862233]
which lock already depends on the new lock.
[86.862251]
the existing dependency chain (in reverse order) is:
[86.862265]
-> #5 (reservation_ww_class_mutex){+.+.}-{3:3}:
[86.862292]        dma_resv_lockdep+0x19a/0x390
[86.862315]        do_one_initcall+0x60/0x3f0
[86.862334]        kernel_init_freeable+0x3cd/0x680
[86.862353]        kernel_init+0x1b/0x200
[86.862369]        ret_from_fork+0x47/0x70
[86.862383]        ret_from_fork_asm+0x1a/0x30
[86.862399]
-> #4 (reservation_ww_class_acquire){+.+.}-{0:0}:
[86.862425]        dma_resv_lockdep+0x178/0x390
[86.862440]        do_one_initcall+0x60/0x3f0
[86.862454]        kernel_init_freeable+0x3cd/0x680
[86.862470]        kernel_init+0x1b/0x200
[86.862482]        ret_from_fork+0x47/0x70
[86.862495]        ret_from_fork_asm+0x1a/0x30
[86.862509]
-> #3 (&mm->mmap_lock){++++}-{3:3}:
[86.862531]        down_read_killable+0x46/0x1e0
[86.862546]        lock_mm_and_find_vma+0xa2/0x280
[86.862561]        do_user_addr_fault+0x266/0x8e0
[86.862578]        exc_page_fault+0x8a/0x2f0
[86.862593]        asm_exc_page_fault+0x27/0x30
[86.862607]        filldir64+0xeb/0x180
[86.862620]        kernfs_fop_readdir+0x118/0x480
[86.862635]        iterate_dir+0xcf/0x2b0
[86.862648]        __x64_sys_getdents64+0x84/0x140
[86.862661]        x64_sys_call+0x1058/0x2660
[86.862675]        do_syscall_64+0x91/0xe90
[86.862689]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[86.862703]
-> #2 (&root->kernfs_rwsem){++++}-{3:3}:
[86.862725]        down_write+0x3e/0xf0
[86.862738]        kernfs_add_one+0x30/0x3c0
[86.862751]        kernfs_create_dir_ns+0x53/0xb0
[86.862765]        internal_create_group+0x134/0x4c0
[86.862779]        sysfs_create_group+0x13/0x20
[86.862792]        topology_add_dev+0x1d/0x30
[86.862806]        cpuhp_invoke_callback+0x4b5/0x850
[86.862822]        cpuhp_issue_call+0xbf/0x1f0
[86.862836]        __cpuhp_setup_state_cpuslocked+0x111/0x320
[86.862852]        __cpuhp_setup_state+0xb0/0x220
[86.862866]        topology_sysfs_init+0x30/0x50
[86.862879]        do_one_initcall+0x60/0x3f0
[86.862893]        kernel_init_freeable+0x3cd/0x680
[86.862908]        kernel_init+0x1b/0x200
[86.862921]        ret_from_fork+0x47/0x70
[86.862934]        ret_from_fork_asm+0x1a/0x30
[86.862947]
-> #1 (cpuhp_state_mutex){+.+.}-{3:3}:
[86.862969]        __mutex_lock+0xaa/0xed0
[86.862982]        mutex_lock_nested+0x1b/0x30
[86.862995]        __cpuhp_setup_state_cpuslocked+0x67/0x320
[86.863012]        __cpuhp_setup_state+0xb0/0x220
[86.863026]        page_alloc_init_cpuhp+0x2d/0x60
[86.863041]        mm_core_init+0x22/0x2d0
[86.863054]        start_kernel+0x576/0xbd0
[86.863068]        x86_64_start_reservations+0x18/0x30
[86.863084]        x86_64_start_kernel+0xbf/0x110
[86.863098]        common_startup_64+0x13e/0x141
[86.863114]
-> #0 (cpu_hotplug_lock){++++}-{0:0}:
[86.863135]        __lock_acquire+0x1635/0x2810
[86.863152]        lock_acquire+0xc4/0x2f0
[86.863166]        cpus_read_lock+0x41/0x100
[86.863180]        stop_machine+0x1c/0x50
[86.863194]        bxt_vtd_ggtt_insert_entries__BKL+0x3b/0x60 [i915]
[86.863987]        intel_ggtt_bind_vma+0x43/0x70 [i915]
[86.864735]        __vma_bind+0x55/0x70 [i915]
[86.865510]        fence_work+0x26/0xa0 [i915]
[86.866248]        fence_notify+0xa1/0x140 [i915]
[86.866983]        __i915_sw_fence_complete+0x8f/0x270 [i915]
[86.867719]        i915_sw_fence_commit+0x39/0x60 [i915]
[86.868453]        i915_vma_pin_ww+0x462/0x1360 [i915]
[86.869228]        i915_vma_pin.constprop.0+0x133/0x1d0 [i915]
[86.870001]        initial_plane_vma+0x307/0x840 [i915]
[86.870774]        intel_initial_plane_config+0x33f/0x670 [i915]
[86.871546]        intel_display_driver_probe_nogem+0x1c6/0x260 [i915]
[86.872330]        i915_driver_probe+0x7fa/0xe80 [i915]
[86.873057]        i915_pci_probe+0xe6/0x220 [i915]
[86.873782]        local_pci_probe+0x47/0xb0
[86.873802]        pci_device_probe+0xf3/0x260
[86.873817]        really_probe+0xf1/0x3c0
[86.873833]        __driver_probe_device+0x8c/0x180
[86.873848]        driver_probe_device+0x24/0xd0
[86.873862]        __driver_attach+0x10f/0x220
[86.873876]        bus_for_each_dev+0x7f/0xe0
[86.873892]        driver_attach+0x1e/0x30
[86.873904]        bus_add_driver+0x151/0x290
[86.873917]        driver_register+0x5e/0x130
[86.873931]        __pci_register_driver+0x7d/0x90
[86.873945]        i915_pci_register_driver+0x23/0x30 [i915]
[86.874678]        i915_init+0x37/0x120 [i915]
[86.875347]        do_one_initcall+0x60/0x3f0
[86.875369]        do_init_module+0x97/0x2a0
[86.875385]        load_module+0x2c54/0x2d80
[86.875398]        init_module_from_file+0x96/0xe0
[86.875413]        idempotent_init_module+0x117/0x330
[86.875426]        __x64_sys_finit_module+0x77/0x100
[86.875440]        x64_sys_call+0x24de/0x2660
[86.875454]        do_syscall_64+0x91/0xe90
[86.875470]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[86.875486]
other info that might help us debug this:
[86.875502] Chain exists of:
  cpu_hotplug_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex
[86.875539]  Possible unsafe locking scenario:
[86.875552]        CPU0                    CPU1
[86.875563]        ----                    ----
[86.875573]   lock(reservation_ww_class_mutex);
[86.875588]                                lock(reservation_ww_class_acquire);
[86.875606]                                lock(reservation_ww_class_mutex);
[86.875624]   rlock(cpu_hotplug_lock);
[86.875637]
 *** DEADLOCK ***
[86.875650] 3 locks held by i915_module_loa/1432:
[86.875663]  #0: ffff888101f5c1b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220
[86.875699]  #1: ffffc90002e0b4a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915]
[86.876512]  #2: ffffc90002e0b4c8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915]
[86.877305]
stack backtrace:
[86.877326] CPU: 0 UID: 0 PID: 1432 Comm: i915_module_loa Tainted: G     U              6.15.0-rc5-CI_DRM_16515-gca0305cadc2d+ #1 PREEMPT(voluntary)
[86.877334] Tainted: [U]=USER
[86.877336] Hardware name:  /NUC5CPYB, BIOS PYBSWCEL.86A.0079.2020.0420.1316 04/20/2020
[86.877339] Call Trace:
[86.877344]  <TASK>
[86.877353]  dump_stack_lvl+0x91/0xf0
[86.877364]  dump_stack+0x10/0x20
[86.877369]  print_circular_bug+0x285/0x360
[86.877379]  check_noncircular+0x135/0x150
[86.877390]  __lock_acquire+0x1635/0x2810
[86.877403]  lock_acquire+0xc4/0x2f0
[86.877408]  ? stop_machine+0x1c/0x50
[86.877422]  ? __pfx_bxt_vtd_ggtt_insert_entries__cb+0x10/0x10 [i915]
[86.878173]  cpus_read_lock+0x41/0x100
[86.878182]  ? stop_machine+0x1c/0x50
[86.878191]  ? __pfx_bxt_vtd_ggtt_insert_entries__cb+0x10/0x10 [i915]
[86.878916]  stop_machine+0x1c/0x50
[86.878927]  bxt_vtd_ggtt_insert_entries__BKL+0x3b/0x60 [i915]
[86.879652]  intel_ggtt_bind_vma+0x43/0x70 [i915]
[86.880375]  __vma_bind+0x55/0x70 [i915]
[86.881133]  fence_work+0x26/0xa0 [i915]
[86.881851]  fence_notify+0xa1/0x140 [i915]
[86.882566]  __i915_sw_fence_complete+0x8f/0x270 [i915]
[86.883286]  i915_sw_fence_commit+0x39/0x60 [i915]
[86.884003]  i915_vma_pin_ww+0x462/0x1360 [i915]
[86.884756]  ? i915_vma_pin.constprop.0+0x6c/0x1d0 [i915]
[86.885513]  i915_vma_pin.constprop.0+0x133/0x1d0 [i915]
[86.886281]  initial_plane_vma+0x307/0x840 [i915]
[86.887049]  intel_initial_plane_config+0x33f/0x670 [i915]
[86.887819]  intel_display_driver_probe_nogem+0x1c6/0x260 [i915]
[86.888587]  i915_driver_probe+0x7fa/0xe80 [i915]
[86.889293]  ? mutex_unlock+0x12/0x20
[86.889301]  ? drm_privacy_screen_get+0x171/0x190
[86.889308]  ? acpi_dev_found+0x66/0x80
[86.889321]  i915_pci_probe+0xe6/0x220 [i915]
[86.890038]  local_pci_probe+0x47/0xb0
[86.890049]  pci_device_probe+0xf3/0x260
[86.890058]  really_probe+0xf1/0x3c0
[86.890067]  __driver_probe_device+0x8c/0x180
[86.890072]  driver_probe_device+0x24/0xd0
[86.890078]  __driver_attach+0x10f/0x220
[86.890083]  ? __pfx___driver_attach+0x10/0x10
[86.890088]  bus_for_each_dev+0x7f/0xe0
[86.890097]  driver_attach+0x1e/0x30
[86.890101]  bus_add_driver+0x151/0x290
[86.890107]  driver_register+0x5e/0x130
[86.890113]  __pci_register_driver+0x7d/0x90
[86.890119]  i915_pci_register_driver+0x23/0x30 [i915]
[86.890833]  i915_init+0x37/0x120 [i915]
[86.891482]  ? __pfx_i915_init+0x10/0x10 [i915]
[86.892135]  do_one_initcall+0x60/0x3f0
[86.892145]  ? __kmalloc_cache_noprof+0x33f/0x470
[86.892157]  do_init_module+0x97/0x2a0
[86.892164]  load_module+0x2c54/0x2d80
[86.892168]  ? __kernel_read+0x15c/0x300
[86.892185]  ? kernel_read_file+0x2b1/0x320
[86.892195]  init_module_from_file+0x96/0xe0
[86.892199]  ? init_module_from_file+0x96/0xe0
[86.892211]  idempotent_init_module+0x117/0x330
[86.892224]  __x64_sys_finit_module+0x77/0x100
[86.892230]  x64_sys_call+0x24de/0x2660
[86.892236]  do_syscall_64+0x91/0xe90
[86.892243]  ? irqentry_exit+0x77/0xb0
[86.892249]  ? sysvec_apic_timer_interrupt+0x57/0xc0
[86.892256]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[86.892261] RIP: 0033:0x7303e1b2725d
[86.892271] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48
[86.892276] RSP: 002b:00007ffddd1fdb38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[86.892281] RAX: ffffffffffffffda RBX: 00005d771d88fd90 RCX: 00007303e1b2725d
[86.892285] RDX: 0000000000000000 RSI: 00005d771d893aa0 RDI: 000000000000000c
[86.892287] RBP: 00007ffddd1fdbf0 R08: 0000000000000040 R09: 00007ffddd1fdb80
[86.892289] R10: 00007303e1c03b20 R11: 0000000000000246 R12: 00005d771d893aa0
[86.892292] R13: 0000000000000000 R14: 00005d771d88f0d0 R15: 00005d771d895710
[86.892304]  </TASK>

Call the asynchronous variant of dma_fence_work_commit() in that case.
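
For illustration only, a minimal userspace sketch of why deferring the
work helps.  It is not the i915 code: every name below (struct work,
commit_sync(), commit_async(), run_worker(), pin()) is a hypothetical
stand-in, and lock ownership is modelled with plain flags per context.
The assumption is that the synchronous variant runs the bind callback in
the pinning task, which holds the reservation lock, while the
asynchronous variant hands it to a worker that does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these are the real i915 definitions. */
enum ctx { CTX_PIN_CALLER, CTX_WORKER };

struct work {
	void (*fn)(struct work *w);
	bool done;
	bool ran_holding_resv;	/* recorded by the callback */
};

static enum ctx current_ctx = CTX_PIN_CALLER;
static bool holds_resv[2];	/* per-context "reservation lock held" flag */
static struct work *pending[8];	/* models the worker's queue */
static size_t n_pending;

/* Models the bind step that would end up calling stop_machine(). */
static void bind_cb(struct work *w)
{
	w->ran_holding_resv = holds_resv[current_ctx];
	w->done = true;
}

/* Synchronous variant: the callback runs in the caller's context. */
static void commit_sync(struct work *w)
{
	w->fn(w);
}

/* Asynchronous variant: the callback is queued for the worker. */
static void commit_async(struct work *w)
{
	assert(n_pending < sizeof(pending) / sizeof(pending[0]));
	pending[n_pending++] = w;
}

/* Models the worker draining its queue in its own context. */
static void run_worker(void)
{
	enum ctx saved = current_ctx;

	current_ctx = CTX_WORKER;
	for (size_t i = 0; i < n_pending; i++)
		pending[i]->fn(pending[i]);
	n_pending = 0;
	current_ctx = saved;
}

/* Models the pin path: take the lock, commit the work, drop the lock. */
static void pin(struct work *w, bool bind_needs_stop_machine)
{
	holds_resv[CTX_PIN_CALLER] = true;
	if (bind_needs_stop_machine)
		commit_async(w);	/* keep the callback out of this context */
	else
		commit_sync(w);
	holds_resv[CTX_PIN_CALLER] = false;
}
```

In this model, pin(&w, false) runs bind_cb() with the reservation flag
set, which corresponds to the chain lockdep reports above, whereas
pin(&w, true) followed by run_worker() runs it from a context that never
took that lock.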

v3: Provide more verbose in-line comment (Andi),
  - mention target environments in commit message.

Fixes: 7d1c261 ("drm/i915: Take reservation lock around i915_vma_pin.")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14985
Cc: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Acked-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20251023082925.351307-6-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 648ef1324add1c2e2b6041cdf0b28d31fbca5f13)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
github-actions bot pushed a commit that referenced this pull request Nov 8, 2025
When a connector is connected but inactive (e.g., disabled by desktop
environments), pipe_ctx->stream_res.tg will be destroyed. Reading
odm_combine_segments then causes a kernel NULL pointer dereference.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 16 UID: 0 PID: 26474 Comm: cat Not tainted 6.17.0+ #2 PREEMPT(lazy)  e6a17af9ee6db7c63e9d90dbe5b28ccab67520c6
 Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN25WW 03/27/2025
 RIP: 0010:odm_combine_segments_show+0x93/0xf0 [amdgpu]
 Code: 41 83 b8 b0 00 00 00 01 75 6e 48 98 ba a1 ff ff ff 48 c1 e0 0c 48 8d 8c 07 d8 02 00 00 48 85 c9 74 2d 48 8b bc 07 f0 08 00 00 <48> 8b 07 48 8b 80 08 02 00>
 RSP: 0018:ffffd1bf4b953c58 EFLAGS: 00010286
 RAX: 0000000000005000 RBX: ffff8e35976b02d0 RCX: ffff8e3aeed052d8
 RDX: 00000000ffffffa1 RSI: ffff8e35a3120800 RDI: 0000000000000000
 RBP: 0000000000000000 R08: ffff8e3580eb0000 R09: ffff8e35976b02d0
 R10: ffffd1bf4b953c78 R11: 0000000000000000 R12: ffffd1bf4b953d08
 R13: 0000000000040000 R14: 0000000000000001 R15: 0000000000000001
 FS:  00007f44d3f9f740(0000) GS:ffff8e3caa47f000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000006485c2000 CR4: 0000000000f50ef0
 PKRU: 55555554
 Call Trace:
  <TASK>
  seq_read_iter+0x125/0x490
  ? __alloc_frozen_pages_noprof+0x18f/0x350
  seq_read+0x12c/0x170
  full_proxy_read+0x51/0x80
  vfs_read+0xbc/0x390
  ? __handle_mm_fault+0xa46/0xef0
  ? do_syscall_64+0x71/0x900
  ksys_read+0x73/0xf0
  do_syscall_64+0x71/0x900
  ? count_memcg_events+0xc2/0x190
  ? handle_mm_fault+0x1d7/0x2d0
  ? do_user_addr_fault+0x21a/0x690
  ? exc_page_fault+0x7e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x6c/0x74
 RIP: 0033:0x7f44d4031687
 Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00>
 RSP: 002b:00007ffdb4b5f0b0 EFLAGS: 00000202 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 00007f44d3f9f740 RCX: 00007f44d4031687
 RDX: 0000000000040000 RSI: 00007f44d3f5e000 RDI: 0000000000000003
 RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000202 R12: 00007f44d3f5e000
 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000040000
  </TASK>
 Modules linked in: tls tcp_diag inet_diag xt_mark ccm snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device x>
  snd_hda_codec_atihdmi snd_hda_codec_realtek_lib lenovo_wmi_helpers think_lmi snd_hda_codec_generic snd_hda_codec_hdmi snd_soc_core kvm snd_compress uvcvideo sn>
  platform_profile joydev amd_pmc mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics loop nfnetlink ip_tables x_tables dm_cryp>
 CR2: 0000000000000000
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:odm_combine_segments_show+0x93/0xf0 [amdgpu]
 Code: 41 83 b8 b0 00 00 00 01 75 6e 48 98 ba a1 ff ff ff 48 c1 e0 0c 48 8d 8c 07 d8 02 00 00 48 85 c9 74 2d 48 8b bc 07 f0 08 00 00 <48> 8b 07 48 8b 80 08 02 00>
 RSP: 0018:ffffd1bf4b953c58 EFLAGS: 00010286
 RAX: 0000000000005000 RBX: ffff8e35976b02d0 RCX: ffff8e3aeed052d8
 RDX: 00000000ffffffa1 RSI: ffff8e35a3120800 RDI: 0000000000000000
 RBP: 0000000000000000 R08: ffff8e3580eb0000 R09: ffff8e35976b02d0
 R10: ffffd1bf4b953c78 R11: 0000000000000000 R12: ffffd1bf4b953d08
 R13: 0000000000040000 R14: 0000000000000001 R15: 0000000000000001
 FS:  00007f44d3f9f740(0000) GS:ffff8e3caa47f000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000006485c2000 CR4: 0000000000f50ef0
 PKRU: 55555554

Fix this by checking pipe_ctx->stream_res.tg before dereferencing.
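
A minimal userspace sketch of that kind of guard, for illustration only.
The struct layouts, the segments field, the helper name
odm_segments_or_unknown() and the -1 "unknown" value are assumptions
made for this example; only the pipe_ctx->stream_res.tg member path is
taken from the description above.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the driver structures. */
struct timing_generator {
	int segments;
};

struct stream_resource {
	struct timing_generator *tg;	/* NULL once the stream is torn down */
};

struct pipe_ctx {
	struct stream_resource stream_res;
};

/*
 * Return the ODM combine segment count, or -1 (an assumed "unknown"
 * value) when there is no pipe context or no timing generator to ask.
 */
static int odm_segments_or_unknown(const struct pipe_ctx *pipe_ctx)
{
	/* Check every pointer on the path before dereferencing it. */
	if (!pipe_ctx || !pipe_ctx->stream_res.tg)
		return -1;

	return pipe_ctx->stream_res.tg->segments;
}
```

With a connected but inactive connector modelled as tg == NULL, the
helper returns the fallback value instead of reading through the NULL
pointer.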

Fixes: 07926ba ("drm/amd/display: Add debugfs interface for ODM combine info")
Signed-off-by: Rong Zhang <i@rong.moe>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f19bbecd34e3c15eed7e5e593db2ac0fc7a0e6d8)
Cc: stable@vger.kernel.org